NVIDIA IndeX API
Distributed Data Import Mechanism

Mechanism for efficient parallel and distributed data imports from arbitrary sources.

Classes

class  nv::index::IDistributed_data_import_callback
 Import callback mechanism for implementing distributed and parallel data loading from arbitrary sources.
 
class  nv::index::Distributed_discrete_data_import_callback< id1, ... >
 This mixin class can be used to implement the IDistributed_data_import_callback interface.
 
class  nv::index::Distributed_continuous_data_import_callback< id1, ... >
 This mixin class can be used to implement the IDistributed_data_import_callback interface.
 

Detailed Description

Mechanism for efficient parallel and distributed data imports from arbitrary sources.

NVIDIA IndeX enables data imports through a distributed import callback mechanism. The import callback is triggered by the nodes in the cluster in parallel: each cluster node that requires a data subset (see Distributed Data Subsets) invokes the import callback for that subset. With such distributed data imports, large-scale datasets are never routed through a single cluster node, which would be a major bottleneck, e.g., when loading terabytes of data. Instead, data subsets are imported in parallel within each cluster node and across all cluster nodes. As a result, NVIDIA IndeX is able to import terabytes of volume data within 2-3 minutes rather than hours.
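The idea of per-node imports can be illustrated with a minimal, self-contained C++ sketch. It does not use the NVIDIA IndeX headers: Bounding_box, Import_callback, split_into_slabs and import_in_parallel are hypothetical names introduced only for this illustration, and each simulated cluster node is represented by a thread. The real entry point is nv::index::IDistributed_data_import_callback, whose methods are not reproduced here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical types for illustration only; not part of the NVIDIA IndeX API.
struct Bounding_box {
    std::array<std::uint32_t, 3> min; // inclusive lower corner
    std::array<std::uint32_t, 3> max; // exclusive upper corner

    std::size_t voxel_count() const {
        return std::size_t(max[0] - min[0]) * (max[1] - min[1]) * (max[2] - min[2]);
    }
};

// The application-provided loader: returns the voxels of one data subset.
using Import_callback = std::function<std::vector<float>(const Bounding_box&)>;

// Splits the volume along x into one slab per simulated node.
std::vector<Bounding_box> split_into_slabs(const Bounding_box& volume,
                                           std::uint32_t node_count)
{
    assert(node_count > 0);
    std::vector<Bounding_box> slabs;
    const std::uint64_t extent = volume.max[0] - volume.min[0];
    for (std::uint32_t i = 0; i < node_count; ++i) {
        Bounding_box slab = volume;
        slab.min[0] = volume.min[0] + static_cast<std::uint32_t>(extent * i / node_count);
        slab.max[0] = volume.min[0] + static_cast<std::uint32_t>(extent * (i + 1) / node_count);
        slabs.push_back(slab);
    }
    return slabs;
}

// Every simulated node invokes the callback for its own subset only,
// so no single node ever handles the whole dataset.
std::vector<std::vector<float>> import_in_parallel(const std::vector<Bounding_box>& subsets,
                                                   const Import_callback& callback)
{
    std::vector<std::vector<float>> results(subsets.size());
    std::vector<std::thread> nodes;
    nodes.reserve(subsets.size());
    for (std::size_t i = 0; i < subsets.size(); ++i) {
        // Each thread writes to a distinct element of results.
        nodes.emplace_back([&, i] { results[i] = callback(subsets[i]); });
    }
    for (std::thread& node : nodes) {
        node.join();
    }
    return results;
}
```

In this sketch the callback must be safe to call concurrently, since every simulated node invokes it at the same time for a different bounding box.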

Besides this core mechanism, which accelerates the cluster-wide import of data subsets, applications can implement their own import callbacks to support (1) proprietary data formats and (2) arbitrary data sources. The data source for an import can be a file system, but also remote storage nodes, cloud storage (including AWS S3 cloud storage, Azure Blob Storage, and GCP cloud buckets), or SQL databases, for example. Applications can also create synthetic data from a compute source.

The NVIDIA IndeX package ships extensions that include importers, e.g., for the common OpenVDB data representation, for the common VDS data format, or for plain raw volume data.