Google has announced a new approach for businesses that want to leverage generative AI without compromising on data residency, regulatory compliance, or real-time performance. Google Cloud recently introduced Google Distributed Cloud (GDC), which brings Google's AI services into customers' own data centers and edge locations as a fully managed hardware and software solution for data-intensive and AI workloads.
The release aims to bridge the gap between the cloud AI revolution and the practical constraints many industries face. For companies in highly regulated sectors, or those handling sensitive information, the ability to process data on-premises without compromising privacy is a major advantage. GDC offers a diverse portfolio of services and server options that can be tailored to enterprise needs, accommodating AI workloads of any intensity.
At the core of this solution is a new generation of servers purpose-built for AI, equipped with NVIDIA H100 Tensor Core GPUs based on the Hopper architecture and 5th Gen Intel Xeon Scalable processors. These servers are designed to run large language models with up to 100 billion parameters, delivering the compute required for natural language processing, image analysis, and other multimodal workloads. Support for NVIDIA Multi-Instance GPU (MIG) partitioning makes it easier to allocate GPU resources across managed and self-owned AI services, reducing total cost of ownership.
One of the highlights of GDC is its new generative AI search solution. Built on Gemma 2, Google's open 9-billion-parameter language model, this conversational search engine runs entirely on-premises. It enables enterprises to ingest and analyze sensitive data within their own infrastructure and gives employees an intuitive way to query those systems using natural language. Transparency is a core design goal: every generated response includes links to the source documents, reducing the risk of errors and AI hallucinations.
The solution relies on a retrieval-augmented generation (RAG) architecture that blends conventional search with generative AI. To keep responses accurate and semantically grounded, the system injects relevant on-premises data into user queries as context before they are processed by the language model. Integration with Vertex AI pre-trained APIs extends its reach further, adding text translation across 105 languages, speech-to-text in 13 languages, and optical character recognition (OCR) in 46 languages.
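The RAG flow described above can be sketched in a few lines. This is a minimal illustration, not GDC's actual implementation: it assumes a toy keyword-overlap retriever and a caller-supplied model function in place of Gemma 2 and the real on-premises search stack. All names (`Document`, `retrieve`, `build_prompt`, `answer`) are hypothetical.

```python
# Minimal sketch of retrieval-augmented generation (RAG).
# Assumptions: a toy keyword-overlap retriever stands in for the real
# search backend, and `model` is any callable that maps a prompt to text.
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    text: str


def retrieve(query: str, corpus: list[Document], k: int = 2) -> list[Document]:
    """Rank documents by naive keyword overlap with the query."""
    terms = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda d: len(terms & set(d.text.lower().split())),
        reverse=True,
    )
    return scored[:k]


def build_prompt(query: str, docs: list[Document]) -> str:
    """Prepend retrieved on-premises context so the model answers from local data."""
    context = "\n".join(f"[{d.doc_id}] {d.text}" for d in docs)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer citing sources:"


def answer(query: str, corpus: list[Document], model) -> dict:
    docs = retrieve(query, corpus)
    response = model(build_prompt(query, docs))
    # Return source IDs alongside the answer, mirroring the cite-the-source
    # design the article describes for reducing hallucinations.
    return {"answer": response, "sources": [d.doc_id for d in docs]}
```

The key design point is that the model only ever sees a prompt assembled locally, so sensitive documents never leave the enterprise's infrastructure.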
The search capability in GDC is complemented by AlloyDB Omni, which handles semantic search and embeddings storage, and by GDC's open architecture, which lets businesses build and deploy customized solutions. Organizations can plug in third-party databases, open-source models, or their own systems, tailoring the deployment to different scenarios and needs.
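To make the semantic-search role concrete, the sketch below shows the core idea behind an embeddings store: documents and queries are mapped to vectors, and retrieval is nearest-neighbor search by cosine similarity. This is a hedged, in-memory stand-in with hand-written 3-dimensional vectors; in GDC the embeddings would come from a real model and the similarity search would run inside AlloyDB Omni. The `EmbeddingStore` class and its methods are illustrative, not an AlloyDB API.

```python
# Illustrative sketch of embedding-based semantic search.
# Assumption: toy hand-made vectors stand in for model-generated embeddings,
# and a Python list stands in for a vector-enabled database table.
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class EmbeddingStore:
    """In-memory stand-in for a vector-search-capable database."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, list[float]]] = []

    def insert(self, doc_id: str, embedding: list[float]) -> None:
        self._rows.append((doc_id, embedding))

    def nearest(self, query_embedding: list[float], k: int = 1) -> list[str]:
        """Return the IDs of the k documents most similar to the query."""
        ranked = sorted(
            self._rows,
            key=lambda row: cosine_similarity(query_embedding, row[1]),
            reverse=True,
        )
        return [doc_id for doc_id, _ in ranked[:k]]
```

Because similarity is computed over meaning-bearing vectors rather than keywords, a query phrased differently from the document text can still find the right match, which is what enables the natural-language search experience described above.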
Google Distributed Cloud offers enterprises a genuinely new way to harness generative AI without data ever leaving their premises. The generative AI search solution is now available in preview; organizations interested in trying it can contact their Google account managers to get started on an on-premises AI initiative.