Pacific Teck offers three parallel file system storage software products:
BeeGFS transparently spreads user data across multiple servers. By increasing the number of servers and disks in the system, you can simply scale performance and capacity of the file system to the level that you need, seamlessly from small clusters up to enterprise-class systems with thousands of nodes.
The flexibility, robustness, and outstanding performance of BeeGFS help our customers around the globe to increase productivity by delivering results faster and by enabling new data analysis methods that were not possible without the advantages of BeeGFS.
Pacific Teck is the official Platinum ThinkParQ Partner in Asia. Installation and support are offered entirely by Pacific Teck and through our partner SIs. We have experience installing BeeGFS at some of the largest sites in Asia, and work back to back with ThinkParQ on source-code-level fixes.
BeeGFS offers maximum performance and scalability on various levels. It supports distributed file contents with flexible striping across the storage servers on a per-file or per-directory basis, as well as distributed metadata.
BeeGFS is optimized for environments where performance matters, and provides:
Best in class client throughput: 8 GB/s with only a single process streaming on a 100 Gbit network, while a few streams can fully saturate the network.
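To make the idea of striping concrete, here is a minimal sketch of how a byte offset in a file could map to a storage target when chunks are distributed round-robin. This is a simplified model for illustration only, not the BeeGFS implementation; the function name, the 512 KiB chunk size, and the four targets are all assumptions.

```python
from typing import NamedTuple


class ChunkLocation(NamedTuple):
    target: int           # index of the storage target holding the byte
    chunk_on_target: int  # number of this file's chunks that precede it on that target
    offset_in_chunk: int  # byte position inside the chunk


def locate(offset: int, chunk_size: int, num_targets: int) -> ChunkLocation:
    """Map a file byte offset to a target under simple round-robin striping."""
    if offset < 0 or chunk_size <= 0 or num_targets <= 0:
        raise ValueError("offset must be >= 0; chunk_size and num_targets must be > 0")
    chunk_index = offset // chunk_size
    return ChunkLocation(
        target=chunk_index % num_targets,
        chunk_on_target=chunk_index // num_targets,
        offset_in_chunk=offset % chunk_size,
    )


# Hypothetical stripe settings: 512 KiB chunks across 4 targets.
CHUNK = 512 * 1024
example = locate(5 * CHUNK + 10, chunk_size=CHUNK, num_targets=4)
```

In this model, consecutive chunks land on different targets, so a large sequential read is served by several storage servers at once. That is the reason adding servers can raise both capacity and throughput.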
Best in class metadata performance: Linear scalability through dynamic metadata namespace partitioning.
Best in class storage throughput: BeeGFS servers allow flexible choice of underlying file system to perfectly fit the given storage hardware.
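As a rough check on the client throughput figure above, the arithmetic can be sketched as follows. It assumes a raw line rate with no protocol overhead and assumes that streams scale linearly, neither of which is claimed by the source; the helper names are hypothetical.

```python
import math

BITS_PER_BYTE = 8


def line_rate_gbytes(link_gbits: float) -> float:
    """Raw link rate in GB/s for a link speed given in Gbit/s (overhead ignored)."""
    return link_gbits / BITS_PER_BYTE


def streams_to_saturate(link_gbits: float, per_stream_gbytes: float) -> int:
    """Smallest number of equal streams whose combined rate reaches the line rate."""
    return math.ceil(line_rate_gbytes(link_gbits) / per_stream_gbytes)


link = line_rate_gbytes(100)             # raw rate of a 100 Gbit link in GB/s
single_stream_share = 8 / link           # fraction of the link one 8 GB/s stream uses
needed = streams_to_saturate(100, 8)     # streams needed under the linear assumption
```

Under these assumptions a 100 Gbit link carries 12.5 GB/s, so a single 8 GB/s stream already uses about 64% of it and two such streams would be enough to fill it.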
BeeGFS supports a wide range of Linux distributions, such as RHEL/Fedora, SLES/openSUSE, and Debian/Ubuntu, as well as a wide range of Linux kernels, from the ancient 2.6.18 up to the latest vanilla kernel.
The storage services run on top of an existing local file system (such as XFS, ZFS, or others) using the normal POSIX interface. Clients and servers can be added to an existing system without downtime.
BeeGFS supports multiple networks and dynamic failover in case one of the network connections is down.
BeeGFS is widely popular among universities and the global research community, powering some of the fastest supercomputers in the world to help scientists analyze large amounts of data efficiently every day.
BeeGFS is the parallel file system of choice in life sciences. The fast-growing volume of genomics data that must be stored and analyzed quickly in fields like precision medicine makes BeeGFS the first choice for our customers.
BeeGFS is used in many different industries all around the globe to provide fast access to storage systems of all kinds and sizes, from small scale up to enterprise-class systems with thousands of hosts.
BeeOND stands for BeeGFS on Demand. It is a complementary product to BeeGFS, but can be used with other file systems as well. It is typically used to aggregate the performance and capacity of internal SSDs or hard disks in compute nodes for the duration of a compute job. This provides additional performance and a very elegant way of burst buffering. It works by creating one or more instances of BeeGFS on these compute nodes, which can be created and destroyed “on demand.”
BeeOND on Compute Nodes
Modern compute nodes are often rich in SSDs and NVMe drives that are underutilized. Grouping these unused high-speed resources together creates a space where users can process some or all of their data much faster than on the standard hard-disk-based file system.
The main advantages of the typical BeeOND use-case on compute nodes are:
BeeOND uses existing NVMe drives and SSDs in the system, even space on SSDs shared with the OS. Many competing burst buffer solutions require purchasing a new layer of expensive hardware, but BeeOND uses resources that already exist.
Pacific Teck is the official ThinkParQ Platinum Partner in Asia. Installation and support are offered entirely by Pacific Teck and through our partner SIs. We have experience with the largest sites in Asia, and work back to back with ThinkParQ on source-code-level fixes. We recommend pairing with enterprise-supported Univa Grid Engine.
BeeOND is a very easy way to remove I/O load, and possibly nasty I/O patterns, from your persistent global file system. Temporary data created during the job runtime never needs to be moved to your global persistent file system at all. Even data that should be preserved after the job ends may be better stored on a BeeOND instance initially and then copied to the persistent global storage at the end, completely sequentially and in large chunks for maximum bandwidth.
Applications can complete faster because, with BeeOND, they can run on SSDs (or maybe even a RAM disk), whereas they might only be running on spinning disks on your normal persistent global file system. Combining the SSDs of multiple compute nodes not only gets you to high bandwidth easily, it also gets you a system that can handle very high IOPS.
Machine learning environments are often rich in NVMe resources for BeeOND to utilize. In Japan, Univa Grid Engine and BeeGFS/BeeOND are integrated at TiTech (540 nodes with 4 P100 and 4 OPA HFIs per node) and ABCI (1088 nodes with 4 V100 and 2 EDR ports per node). Univa Grid Engine kicks off BeeOND and tells it how many NVMe drives to use, when to use them, and what to do after the job finishes. This is really an on-demand burst buffer using the NVMe drives contained in compute nodes.
BeeOND helps scientific clusters around the world achieve burst-buffer-level performance increases without breaking the budget. Together with Univa Grid Engine, this rich resource built from existing hardware can be shared fairly among groups and users.
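The stage-out step mentioned above, copying results from a BeeOND instance to the persistent global file system sequentially and in large chunks, could be sketched like this. It is a minimal illustration using only the Python standard library, not a BeeOND tool; the function name, the 64 MiB block size, and the mount paths in the comment are assumptions.

```python
import shutil
from pathlib import Path

# Assumed block size for each read/write; a real site would tune this value.
CHUNK_BYTES = 64 * 1024 * 1024


def stage_out(src: Path, dst: Path, chunk_bytes: int = CHUNK_BYTES) -> int:
    """Copy src to dst front to back in large blocks and return the bytes written."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    with src.open("rb") as fin, dst.open("wb") as fout:
        # copyfileobj reads and writes sequentially, chunk_bytes at a time.
        shutil.copyfileobj(fin, fout, length=chunk_bytes)
    return dst.stat().st_size


# Hypothetical usage at the end of a job (both paths are placeholders):
# stage_out(Path("/mnt/beeond/job/result.dat"), Path("/mnt/global/project/result.dat"))
```

Writing one file from start to finish in large blocks gives the global file system a simple streaming pattern, instead of the small, scattered writes an application may produce while it runs.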
NVMesh was inspired by how Tech Giants like Amazon, Facebook and Google have redefined infrastructures for web-scale applications, leveraging standard servers and shared-nothing architectures to ensure maximum operational efficiency and flexibility. For their web-scale applications, enterprises and service providers are seeking to optimize their infrastructures in the same way as the Tech Giants. For storage, this means they want to deploy scale-out storage infrastructures leveraging standard servers and software-defined storage solutions.
Excelero’s NVMesh is the lowest latency distributed block storage for shared NVMe on the market. It’s a 100% software-defined solution that supports any hardware. Being pure block storage, NVMesh runs any local or distributed file system. NVMesh adds critical sets of capabilities that make it easier for enterprises and service providers to deploy shared NVMe storage at local performance across a far wider range of network protocols and applications.
NVMesh features a distributed block layer that allows unmodified applications to utilize pooled NVMe storage devices across a network at local speeds and latencies. Distributed NVMe storage resources are pooled with the ability to create arbitrary, dynamic block volumes that can be utilized by any host running the NVMesh block client. In short, applications can enjoy the latency, throughput and IOPS of a local NVMe device while at the same time getting the benefits of centralized, redundant storage. NVMesh is deployed as a virtual, distributed non-volatile array and supports both converged and disaggregated architectures, giving customers full freedom in their architectural design.
As most enterprise servers become NVMe-enabled, the rush is on to allow more teams to share NVMe SSD resources. Excelero’s NVMesh is a complete web-scale SDS solution with the distributed data protection and storage provisioning that make shared NVMe storage practical, efficient and readily managed.
MeshConnect™ features new support for traditional network technologies, giving NVMesh the widest selection of supported fabrics and protocols. Supported protocols are TCP/IP, RDMA and Fibre Channel; supported fabrics include Ethernet, Fibre Channel and InfiniBand.
MeshProtect™ is a flexible, distributed data protection architecture offering different protection levels, matching resiliency and performance to application needs. Options range from no redundancy through mirroring (N+1) to parity-based protection (N+M). The latter provides over 90% storage efficiency, yet delivers ultra-low-latency performance on large-scale configurations.
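To see where a figure like "over 90% storage efficiency" can come from, here is a small sketch of the usual capacity arithmetic for redundancy schemes: with N units of data and M units of redundancy, the usable fraction is N / (N + M). The specific layouts below are illustrative assumptions and are not taken from NVMesh documentation; mirroring is modeled here as one data copy plus one redundant copy.

```python
from fractions import Fraction


def storage_efficiency(data_units: int, redundancy_units: int) -> Fraction:
    """Usable fraction of raw capacity for N data units plus M redundancy units."""
    if data_units <= 0 or redundancy_units < 0:
        raise ValueError("data_units must be > 0 and redundancy_units must be >= 0")
    return Fraction(data_units, data_units + redundancy_units)


# Assumed example layouts as (N, M); none of these are vendor-confirmed settings.
layouts = {
    "no redundancy": (1, 0),
    "mirroring, modeled as 1+1": (1, 1),
    "parity 10+1": (10, 1),
    "parity 10+2": (10, 2),
}

for name, (n, m) in layouts.items():
    print(f"{name}: {float(storage_efficiency(n, m)):.1%}")
```

In this model the efficiency exceeds 90% only when N is greater than 9 times M, so a 10+1 layout (about 90.9%) qualifies while a 10+2 layout (about 83.3%) does not. Mirroring, by contrast, keeps only half of the raw capacity usable.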
MeshInspect™ provides performance analytics for pinpointing anomalies quickly and at scale. Customers benefit from elaborate cluster-wide and per-object performance and utilization statistics that help with the monitoring and analysis of the storage environment performance. Administrators can benefit from fully customizable display of detailed metrics of application workloads and datasets.
Excelero delivers low-latency distributed block storage for web-scale applications. NVMesh enables shared NVMe across any network and supports any local or distributed file system. The solution features an intelligent management layer that abstracts underlying hardware with CPU offload, creates logical volumes with redundancy, and provides centralized, intelligent management and monitoring. Applications can enjoy the latency, throughput and IOPS of a local NVMe device with the convenience of centralized storage while avoiding proprietary hardware lock-in and reducing the overall storage TCO.
Being a 100% software-based solution, NVMesh was built to give customers maximum flexibility in designing storage infrastructures. With MeshConnect™, customers can choose the network fabric and protocol that best meets their performance or efficiency requirements. NVMesh supports the widest selection of protocols and fabrics, including TCP/IP, Fibre Channel, InfiniBand, RoCE v2, RDMA and NVMe-oF. MeshProtect™ offers flexible protection levels for differing application needs, including mirrored and parity-based redundancy. MeshInspect™ provides performance analytics for pinpointing anomalies quickly and at scale. NVMesh is deployed as a virtual, distributed non-volatile array and supports both converged and disaggregated architectures, giving customers full freedom in their architectural design.
Local performance across the network.
Predictable application performance. Smart insights in utilization.
Maximize the utilization of your flash media. Reduce your capacity overhead. Easily manage & monitor.
Utilize any hardware. Use existing network infrastructure. Choose from multiple redundancy options.
Lustre is a parallel distributed file system used for large-scale cluster computing. Parallel file systems allow multiple nodes to process large files at once.
When properly tuned, Lustre delivers high performance. It is also fully open source, with enterprise support options provided by Pacific Teck. Pacific Teck can provide local support under a back-to-back agreement with Intel, who can provide developer-level fixes.
Don’t let storage be your bottleneck. As the compute side of the cluster gets more and more powerful, slow storage could prevent you from fully harnessing that performance. A high-speed file system like Lustre and a high-speed interconnect like Omni-Path are critical elements in maximizing cluster performance.
The drawback of Lustre is that it is not easy to install, tune, and support. That is where Pacific Teck’s enterprise support adds value. Pacific Teck can analyze your cluster top to bottom and develop a custom build to get you started on the right foot. From there, we will work to make sure all settings, drivers, etc. are working together properly to ensure maximum speed.