Wednesday, March 21, 2018

Data center migration: what is the essence of the hyper-converged architecture?

In discussing data center migration, the goal here is not to judge right and wrong, nor to debate what "true" hyper-convergence is, but to step back and ask why the industry is now so focused on hyper-convergence, and what kind of IT infrastructure is better "suited" to today's business.
First of all, the reason for choosing a hyper-converged architecture is that traditional storage can no longer solve the problems of the enterprise data center.
According to a McKinsey study, global IT data is growing at roughly 40% per year. Data increasingly drives the business: enterprises make decisions and manage operations through data analysis.
Faster CPUs alone, however, are not enough. The bottleneck is that traditional storage disks are too slow, so most of the computing capacity sits idle or waits for data to arrive from storage. Neither the capacity nor the performance of traditional storage scales to match this "computing power", so it cannot meet the data-access requirements of the enterprise.
The problem is not new. Google ran into it early on. So how did Google solve it?
As a company providing data retrieval to Internet users around the world, Google evaluated the EMC, IBM, and SUN storage products of the day, but none of them could solve its problem: neither in capacity nor in performance could those products meet the scale Google needed. So Google had no choice but to build a storage architecture of its own for its data search.
Google's computer scientists broke with traditional storage thinking. Using the servers' local hard disks and software, they built a distributed file system whose capacity and performance both scale out, and then built their search and analytics engines on top of it. Instead of pulling data out of storage and shipping it across the network to the compute side, the computation is pushed down to where the data is stored: what travels over the network is the "computation", not the data. Access to large volumes of data becomes local access that no longer crosses the network, so it is naturally fast. As a result, "compute" and "storage" naturally run ("fused") on the same servers, and here we can already see one of the strengths of the hyper-converged architecture: data is accessed locally and does not have to cross the network.
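As a minimal sketch of this "ship the computation, not the data" idea — the node names, shard layout, and functions below are hypothetical illustrations, not Google's actual GFS/MapReduce API:

```python
# Sketch: send a small task to each node so it reads its own local shard,
# instead of shipping all the data across the network to one compute side.
from typing import Callable, Iterable

# Each node holds its own shard of the data on local disk (assumed data).
CLUSTER_SHARDS = {
    "node-1": [3, 1, 4, 1, 5],
    "node-2": [9, 2, 6, 5, 3],
    "node-3": [5, 8, 9, 7, 9],
}

def run_locally(node: str, task: Callable[[Iterable[int]], int]) -> int:
    """Pretend to ship the tiny 'task' to the node; the node reads its
    shard from local disk rather than sending the shard over the network."""
    return task(CLUSTER_SHARDS[node])

def distributed_sum() -> int:
    # Only small partial results cross the network, never the raw data.
    partials = [run_locally(node, sum) for node in CLUSTER_SHARDS]
    return sum(partials)

if __name__ == "__main__":
    print(distributed_sum())  # 77
```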
Modern enterprises hold ever more data and run ever more applications, and they are starting to face the same problems Google once did. CIOs must consider how to build their compute and storage infrastructure more efficiently to meet the data-access demands of their applications.
Virtualization makes applications easier to manage and solves the problem of idle CPU and memory resources. But as virtualization is deployed at scale, the number of virtual machines keeps growing, and those virtual machines run slower and slower on traditional storage. "Slow" means "poor experience", and "poor experience" becomes the biggest bottleneck restricting the adoption of virtualization. The root cause is insufficient storage I/O performance: with large numbers of virtual machines and containers running at the same time, their I/O streams get blended together, random reads and writes surge, and the traditional storage architecture simply cannot sustain that much random I/O. Hyper-convergence was brought into the virtualization and container space precisely to solve this problem. The industry has also tried other ways to attack the I/O problem, so let us look at them. Solution 1: put an SSD cache inside the storage array to accelerate I/O. This can be effective at a certain scale, but the SSD cache in a storage array is small, usually less than 5% of total capacity, which makes it hard to hold the users' hot data. In addition, the array still cannot be expanded on demand, and all data still flows through the centralized storage controllers, which inevitably congests that "highway".
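A back-of-the-envelope sketch of why a small array-side cache struggles — the capacity, cache ratio, and working-set figures below are assumptions for illustration, not vendor data:

```python
# Why a <=5% SSD cache in a storage array struggles with a large
# virtualized working set (all numbers are assumptions).
array_capacity_tb = 100.0
ssd_cache_ratio = 0.05          # cache is at most ~5% of capacity
working_set_tb = 10.0           # assumed combined hot data of all VMs

cache_tb = array_capacity_tb * ssd_cache_ratio
# Optimistic upper bound: assume every cached block is hot data.
best_case_hit_rate = min(1.0, cache_tb / working_set_tb)

print(f"SSD cache: {cache_tb:.1f} TB")
print(f"Best-case hit rate: {best_case_hit_rate:.0%}")
# -> 5.0 TB of cache, at most ~50% of hot I/O served from flash;
#    the rest still lands on the disks behind one controller pair.
```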
Solution 2: use a server-side SSD cache to accelerate I/O. This is a similar approach, but server-side caches usually lack highly reliable software behind them. If the cache is used as a write cache, it becomes a single point of failure, so replicas must be kept across the cache devices of multiple servers to provide reliability. You could call this a crippled hyper-converged architecture: the cache sits on the server side, but the data still lives on traditional storage, and when the cache fills up it has to be written back to that storage, so the "controllers" of the traditional array still limit overall performance.
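To make the replication requirement concrete, here is a minimal sketch — the classes and key names are hypothetical, not any product's API — of why an acknowledged write in a server-side cache must also live on a peer node:

```python
# Sketch: a server-side write cache must mirror each write to a peer
# before acknowledging it, or a single server failure loses data.
class CacheNode:
    def __init__(self, name: str):
        self.name = name
        self.entries: dict[str, bytes] = {}

    def store(self, key: str, data: bytes) -> None:
        self.entries[key] = data

def cached_write(key: str, data: bytes,
                 local: CacheNode, peer: CacheNode) -> str:
    local.store(key, data)   # write to the local SSD cache...
    peer.store(key, data)    # ...and mirror to a peer before acknowledging
    return "ack"

node_a, node_b = CacheNode("a"), CacheNode("b")
cached_write("vm1/block42", b"...", node_a, node_b)
# The cache must still be flushed back to the traditional array later,
# so the array's controllers remain the ceiling on sustained performance.
```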
Both workarounds, as we can see, are limited by the architecture of traditional storage. Hyper-convergence does not keep that architecture: it discards traditional storage entirely and uses a distributed file system to provide "unlimited" performance and capacity, then accelerates further with caching on top, or even builds the whole system on flash (all-flash products). The scaling is then natural and unconstrained.
The point of a hyper-converged architecture, therefore, is not to make storage on a single server fast, but to make storage performance grow linearly with every server added, so that the storage architecture no longer limits the growth of the business while the reliability of the business is preserved.
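The linear-scaling claim can be illustrated with some simple arithmetic — the per-node IOPS, capacity, and replication factor below are assumptions, not measurements of any particular product:

```python
# Illustration (assumed per-node numbers): aggregate storage performance
# and usable capacity grow with every server added to the cluster.
PER_NODE_IOPS = 50_000        # assumption: what one node's local SSDs deliver
PER_NODE_CAPACITY_TB = 10     # assumption: raw capacity per node
REPLICATION_FACTOR = 2        # each block stored on two nodes for reliability

def cluster_totals(nodes: int) -> tuple[int, float]:
    """Aggregate IOPS and usable capacity for an n-node cluster."""
    iops = nodes * PER_NODE_IOPS
    usable_tb = nodes * PER_NODE_CAPACITY_TB / REPLICATION_FACTOR
    return iops, usable_tb

for n in (3, 6, 12):
    iops, tb = cluster_totals(n)
    print(f"{n:2d} nodes: {iops:,} IOPS, {tb:.0f} TB usable")
# Unlike a dual-controller array, no fixed controller pair sits in the
# path of every I/O, so both numbers keep growing as nodes are added.
```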
It is on this kind of widely shared storage that the whole of Google's business runs smoothly. What SMARTX is doing is building a better and more stable foundation service of the same kind.
In addition, the rapid rise of hyper-convergence in recent years also owes a great deal to hardware. CPUs have more and more cores, servers carry more and more memory, and SSDs and network equipment keep getting faster, which means: a. Besides running the business, a server now has enough spare CPU and memory to run the storage software as well. Running the storage software alongside the business reduces the amount of equipment and the power consumption, and local reads improve I/O efficiency. This was not feasible a few years ago, when CPU and memory were far more limited.
b. Interconnect networks keep getting faster, whether 10 Gb Ethernet, 40 Gb Ethernet, or InfiniBand, which lets software pool the storage devices scattered across servers into a distributed file system, a shared storage pool for the applications above to use.
c. Hardware vendors such as SSD makers let a single storage device run faster; the point of our software is to aggregate many such devices so that, working together, they provide ever-growing overall performance and capacity.
