Genesis Cloud and Vast Data: New high-performance services for AI training


HPC + Mass Storage = AI
Genesis Cloud and Vast team up

Genesis Cloud will use the Vast Data Platform to store large amounts of data generated by AI training. CloudComputing Insider spoke with Vast co-founder Jeff Denworth and Genesis Cloud CTO Dave Hughes about the new joint high-performance services, which in the medium term should also lead to a search engine for unstructured data.

Genesis Cloud will use the Vast Data Platform to store large amounts of data generated by AI training.

(Image: free license, geralt / Pixabay)

Genesis Cloud (GC), not to be confused with Genesys Cloud, an American provider of contact center products whose German headquarters is also in Munich, is a relatively young company that offers Infrastructure-as-a-Service (IaaS) for GPU-based clouds. Such infrastructure is needed in high-performance computing environments in general, but above all for training large language models (LLMs). That training generates large amounts of data, which will be stored on the Vast Data Platform in the future. The two companies have announced their partnership.

“By joining forces, we can now also fully serve business customers,” said Dave Hughes in an interview with CloudComputing Insider. He is the Vice President of Engineering at GC (a kind of CTO, as he puts it himself) and is responsible for research and development, products, infrastructure and operations. He prefers to work from the UK: although GC is a German company, it is well positioned internationally.

Dave Hughes in conversation with CloudComputing Insider via Zoom. He is the Vice President of Engineering at Genesis Cloud.

(Photo: Müller)

Better than hyperscalers?

“By storing on the Vast Data Platform, we can provide an automated infrastructure with exceptional performance alongside virtualization,” said Hughes. Why shouldn’t users simply go straight to a hyperscaler? “Our offer is cheaper, more convenient, ready to use more quickly and, above all, there is no risk of vendor lock-in,” explains Hughes. “In addition, the customer has complete freedom in the design,” adds Jeff Denworth, co-founder of Vast and responsible for Technical Sales and Marketing.

The joint GC and Vast system is also multi-tenant, meaning that several different users can access the data storage through the public cloud. “Take S3,” says Denworth. To store data in a multi-tenant S3 setup, it has to be distributed across buckets and keys. Each AWS storage technology also has its own set of partitioning models. VAST Data Platform’s data parallelism, on the other hand, automatically segments and labels data across multiple protocols. “In short: you definitely don’t want to store the large amounts of data generated during AI training on S3,” Denworth says.

Jeff Denworth, co-founder of Vast, in a video conference with CloudComputing Insider.

(Photo: Müller)

How exactly do the partners want to collaborate?

“Based on our platform, namespaces can be created that can span multiple regions,” explains Vast’s co-founder. “This is perfectly compatible with GC technology, which provides a virtual cluster in a namespace with access controls and resources such as local DNS names, shells and services.”

In addition, the Genesis clusters with their one, two or even three thousand GPUs are currently “state of the art”, and GC is not burdened with a pile of outdated equipment, according to Denworth’s assessment of his new partner. GC’s newly built and comparatively energy-efficient data centers are located in Iceland and Norway. They feature a non-blocking leaf-spine architecture based on advanced switches. Each server is connected to a switch via network cards, and each account has its own dedicated network for added privacy and security. The GPU clusters use, for example, the new Nvidia H100 Tensor Core GPU. This is how LLM training makes sense, unlike in legacy data centers, says Denworth.

Vast carries no legacy baggage either. That is not surprising, since the company is younger than Genesis Cloud: it was founded in 2019 by Denworth, among others. It provides an infrastructure designed from the ground up for deep learning and for GPU-accelerated data centers and clouds. Seen this way, GC fits Vast like a glove.

Medium-sized companies and enterprises as new customers

Hughes, for his part, went on to report that in Genesis Cloud’s early days the main thing was price: the company had to score points with low prices because its customers were mostly enthusiastic but poorly funded users. With the advent of AI, this has fundamentally changed; GPU clusters are now mostly used by medium-sized and large companies. And for these customers, as mentioned, Hughes needs a stable and fast storage platform that is also compatible with Genesis Cloud technology.

According to Hughes and Denworth, this joint system will have to handle data on a scale that was previously unimaginable. And what the two are planning beyond that sounds quite interesting: together they want nothing less than to capture all the unstructured data on the Internet and store it in classified form. As is well known, around 90 percent of the data on the Internet is unstructured, “but no database in the world has been able to get information from a Zoom meeting, for example,” Denworth points out. The joint goal of the partnership is to make this possible using AI. Specifically, there are plans to develop a search engine for unstructured data. That would be truly revolutionary; we will keep an eye on it.

(ID: 49888729)