libcudf  24.02.00

Files

file  contiguous_split.hpp
 Table APIs for contiguous_split, pack, unpack, and metadata.
 
file  copying.hpp
 Column APIs for gather, scatter, split, slice, etc.
 

Classes

struct  cudf::packed_columns
 Column data in a serialized format.
 
struct  cudf::packed_table
 The result(s) of a cudf::contiguous_split.
 
class  cudf::chunked_pack
 Perform a chunked "pack" operation of the input table_view using a user-provided buffer of size user_buffer_size.
 

Functions

std::vector< packed_table > cudf::contiguous_split (cudf::table_view const &input, std::vector< size_type > const &splits, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Performs a deep-copy split of a table_view into a vector of packed_table, where each packed_table uses a single contiguous block of memory for all of the split's column data.
 
packed_columns cudf::pack (cudf::table_view const &input, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Deep-copy a table_view into a serialized contiguous memory format.
 
std::vector< uint8_t > cudf::pack_metadata (table_view const &table, uint8_t const *contiguous_buffer, size_t buffer_size)
 Produce the metadata used for packing a table stored in a contiguous buffer.
 
table_view cudf::unpack (packed_columns const &input)
 Deserialize the result of cudf::pack.
 
table_view cudf::unpack (uint8_t const *metadata, uint8_t const *gpu_data)
 Deserialize the result of cudf::pack.
 
std::vector< column_view > cudf::split (column_view const &input, host_span< size_type const > splits, rmm::cuda_stream_view stream=cudf::get_default_stream())
 Splits a column_view into a set of column_views according to a set of indices derived from expected splits.
 
std::vector< column_view > cudf::split (column_view const &input, std::initializer_list< size_type > splits, rmm::cuda_stream_view stream=cudf::get_default_stream())
 Splits a column_view into a set of column_views according to a set of indices derived from expected splits.
 
std::vector< table_view > cudf::split (table_view const &input, host_span< size_type const > splits, rmm::cuda_stream_view stream=cudf::get_default_stream())
 Splits a table_view into a set of table_views according to a set of indices derived from expected splits.
 
std::vector< table_view > cudf::split (table_view const &input, std::initializer_list< size_type > splits, rmm::cuda_stream_view stream=cudf::get_default_stream())
 Splits a table_view into a set of table_views according to a set of indices derived from expected splits.
 

Detailed Description

Function Documentation

◆ contiguous_split()

std::vector<packed_table> cudf::contiguous_split ( cudf::table_view const &  input,
std::vector< size_type > const &  splits,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Performs a deep-copy split of a table_view into a vector of packed_table, where each packed_table uses a single contiguous block of memory for all of the split's column data.

The memory for the output views is allocated in a single contiguous rmm::device_buffer returned in the packed_table. There is no top-level owning table.

The returned views of input are constructed from a vector of indices that indicate where each split should occur. The ith returned table_view is sliced as [0, splits[i]) if i=0, as [splits[i-1], input.size()) if i is the last view, and as [splits[i-1], splits[i]) otherwise.

For all i it is expected splits[i] <= splits[i+1] <= input.size(). For a splits size N, there will always be N+1 splits in the output.

Note
It is the caller's responsibility to ensure that the returned views do not outlive the viewed device memory contained in the all_data field of the returned packed_table.
Example:
input: [{10, 12, 14, 16, 18, 20, 22, 24, 26, 28},
{50, 52, 54, 56, 58, 60, 62, 64, 66, 68}]
splits: {2, 5, 9}
output: [{{10, 12}, {14, 16, 18}, {20, 22, 24, 26}, {28}},
{{50, 52}, {54, 56, 58}, {60, 62, 64, 66}, {68}}]
Exceptions
cudf::logic_error  If splits has end index > size of input.
cudf::logic_error  When the value in splits is not in the range [0, input.size()).
cudf::logic_error  When the values in the splits are 'strictly decreasing'.
Parameters
input  View of a table to split
splits  A vector of indices where the view will be split
mr  An optional memory resource to use for all returned device allocations
Returns
The set of requested views of input indicated by the splits and the viewed memory buffer
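The slicing rule above can be sketched on the host without libcudf. The helper below, split_ranges, is a hypothetical name and not part of the library; it only illustrates how N split points yield N+1 half-open row ranges, assuming the splits are already valid and non-decreasing.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper (not a libcudf function): converts N split points into
// the N+1 half-open row ranges [begin, end) implied by the slicing rule.
// Assumes the splits are non-decreasing and no larger than num_rows.
std::vector<std::pair<int, int>> split_ranges(int num_rows, std::vector<int> const& splits)
{
  std::vector<std::pair<int, int>> ranges;
  ranges.reserve(splits.size() + 1);
  int begin = 0;
  for (int end : splits) {
    ranges.emplace_back(begin, end);  // [previous split, current split)
    begin = end;
  }
  ranges.emplace_back(begin, num_rows);  // last range runs to the end of the input
  return ranges;
}

// Checks the documented example: 10 rows with splits {2, 5, 9}.
void check_documented_example()
{
  auto const ranges = split_ranges(10, {2, 5, 9});
  assert(ranges.size() == 4);
  assert(ranges.front().first == 0 && ranges.front().second == 2);
  assert(ranges.back().first == 9 && ranges.back().second == 10);
}
```

For the example input of 10 rows, the ranges [0, 2), [2, 5), [5, 9), and [9, 10) correspond to the four output pieces of sizes 2, 3, 4, and 1.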

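◆ pack()

packed_columns cudf::pack ( cudf::table_view const &  input,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)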

Deep-copy a table_view into a serialized contiguous memory format.

The metadata from the table_view is copied into a host vector of bytes and the data from the table_view is copied into a device_buffer. Pass the output of this function into cudf::unpack to deserialize.

Parameters
input  View of the table to pack
mr  An optional memory resource to use for all returned device allocations
Returns
packed_columns A struct containing the serialized metadata and data in contiguous host and device memory respectively

◆ pack_metadata()

std::vector<uint8_t> cudf::pack_metadata ( table_view const &  table,
uint8_t const *  contiguous_buffer,
size_t  buffer_size 
)

Produce the metadata used for packing a table stored in a contiguous buffer.

The metadata from the table_view is copied into a host vector of bytes which can be used to construct a packed_columns or packed_table structure. The caller is responsible for guaranteeing that all of the columns in the table point into contiguous_buffer.

Parameters
table  View of the table to pack
contiguous_buffer  A contiguous buffer of device memory which contains the data referenced by the columns in table
buffer_size  The size of contiguous_buffer
Returns
Vector of bytes representing the metadata used to unpack a packed_columns struct

◆ split() [1/4]

std::vector<column_view> cudf::split ( column_view const &  input,
host_span< size_type const >  splits,
rmm::cuda_stream_view  stream = cudf::get_default_stream() 
)

Splits a column_view into a set of column_views according to a set of indices derived from expected splits.

The returned views of input are constructed from a vector of splits, which indicates where each split should occur. The ith returned column_view is sliced as [0, splits[i]) if i=0, as [splits[i-1], input.size()) if i is the last view, and as [splits[i-1], splits[i]) otherwise.

For all i it is expected splits[i] <= splits[i+1] <= input.size(). For a splits size N, there will always be N+1 splits in the output.

Note
It is the caller's responsibility to ensure that the returned views do not outlive the viewed device memory.
Example:
input: {10, 12, 14, 16, 18, 20, 22, 24, 26, 28}
splits: {2, 5, 9}
output: {{10, 12}, {14, 16, 18}, {20, 22, 24, 26}, {28}}
Exceptions
cudf::logic_error  If splits has end index > size of input.
cudf::logic_error  When the value in splits is not in the range [0, input.size()).
cudf::logic_error  When the values in the splits are 'strictly decreasing'.
Parameters
input  View of column to split
splits  Indices where the view will be split
stream  CUDA stream used for device memory operations and kernel launches
Returns
The set of requested views of input indicated by the splits
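The documented preconditions can be illustrated with a small host-side check. This is a sketch only: validate_splits and splits_are_valid are hypothetical names, std::logic_error stands in for cudf::logic_error, and the check assumes that a split index equal to the input size is acceptable, following the ordering condition splits[i] <= splits[i+1] <= input.size(). The library's actual validation may differ.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical check (not a libcudf function) mirroring the documented
// preconditions. std::logic_error stands in for cudf::logic_error.
// Assumption: an index equal to num_rows is treated as valid.
void validate_splits(int num_rows, std::vector<int> const& splits)
{
  int previous = 0;
  for (int s : splits) {
    if (s < 0 || s > num_rows) { throw std::logic_error("split index out of range"); }
    if (s < previous) { throw std::logic_error("split indices must be non-decreasing"); }
    previous = s;
  }
}

// Convenience wrapper that reports validity instead of throwing.
bool splits_are_valid(int num_rows, std::vector<int> const& splits)
{
  try {
    validate_splits(num_rows, splits);
    return true;
  } catch (std::logic_error const&) {
    return false;
  }
}

// Exercises one valid and two invalid split vectors for a 10-row input.
void check_validation_examples()
{
  assert(splits_are_valid(10, {2, 5, 9}));
  assert(!splits_are_valid(10, {5, 2}));  // decreasing
  assert(!splits_are_valid(10, {11}));    // past the end of the input
}
```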

◆ split() [2/4]

std::vector<column_view> cudf::split ( column_view const &  input,
std::initializer_list< size_type >  splits,
rmm::cuda_stream_view  stream = cudf::get_default_stream() 
)

Splits a column_view into a set of column_views according to a set of indices derived from expected splits.

The returned views of input are constructed from a vector of splits, which indicates where each split should occur. The ith returned column_view is sliced as [0, splits[i]) if i=0, as [splits[i-1], input.size()) if i is the last view, and as [splits[i-1], splits[i]) otherwise.

For all i it is expected splits[i] <= splits[i+1] <= input.size(). For a splits size N, there will always be N+1 splits in the output.

Note
It is the caller's responsibility to ensure that the returned views do not outlive the viewed device memory.
Example:
input: {10, 12, 14, 16, 18, 20, 22, 24, 26, 28}
splits: {2, 5, 9}
output: {{10, 12}, {14, 16, 18}, {20, 22, 24, 26}, {28}}
Exceptions
cudf::logic_error  If splits has end index > size of input.
cudf::logic_error  When the value in splits is not in the range [0, input.size()).
cudf::logic_error  When the values in the splits are 'strictly decreasing'.
Parameters
input  View of column to split
splits  Indices where the view will be split
stream  CUDA stream used for device memory operations and kernel launches
Returns
The set of requested views of input indicated by the splits

◆ split() [3/4]

std::vector<table_view> cudf::split ( table_view const &  input,
host_span< size_type const >  splits,
rmm::cuda_stream_view  stream = cudf::get_default_stream() 
)

Splits a table_view into a set of table_views according to a set of indices derived from expected splits.

The returned views of input are constructed from a vector of splits, which indicates where each split should occur. The ith returned table_view is sliced as [0, splits[i]) if i=0, as [splits[i-1], input.size()) if i is the last view, and as [splits[i-1], splits[i]) otherwise.

For all i it is expected splits[i] <= splits[i+1] <= input.size(). For a splits size N, there will always be N+1 splits in the output.

Note
It is the caller's responsibility to ensure that the returned views do not outlive the viewed device memory.
Example:
input: [{10, 12, 14, 16, 18, 20, 22, 24, 26, 28},
{50, 52, 54, 56, 58, 60, 62, 64, 66, 68}]
splits: {2, 5, 9}
output: [{{10, 12}, {14, 16, 18}, {20, 22, 24, 26}, {28}},
{{50, 52}, {54, 56, 58}, {60, 62, 64, 66}, {68}}]
Exceptions
cudf::logic_error  If splits has end index > size of input.
cudf::logic_error  When the value in splits is not in the range [0, input.size()).
cudf::logic_error  When the values in the splits are 'strictly decreasing'.
Parameters
input  View of a table to split
splits  Indices where the view will be split
stream  CUDA stream used for device memory operations and kernel launches
Returns
The set of requested views of input indicated by the splits
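For a table, the same row ranges apply to every column, so each output piece contains one slice per input column. The host-only sketch below models that with std::vector; host_column, host_table, and split_rows are hypothetical names, the splits are assumed valid, and the pieces are copies here, whereas cudf::split returns non-owning views.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Host-only stand-ins for a column and a table (not libcudf types).
using host_column = std::vector<int>;
using host_table  = std::vector<host_column>;

// Hypothetical model of a row-wise table split: every column is cut at the
// same row offsets. Assumes valid, non-decreasing splits and equal-length
// columns. Unlike cudf::split, this copies the data instead of viewing it.
std::vector<host_table> split_rows(host_table const& input,
                                   std::vector<std::ptrdiff_t> const& splits)
{
  std::ptrdiff_t const num_rows =
    input.empty() ? 0 : static_cast<std::ptrdiff_t>(input.front().size());
  std::vector<host_table> pieces;
  std::ptrdiff_t begin = 0;
  auto emit = [&](std::ptrdiff_t end) {
    host_table piece;
    for (auto const& col : input) {
      piece.emplace_back(col.begin() + begin, col.begin() + end);  // rows [begin, end)
    }
    pieces.push_back(std::move(piece));
    begin = end;
  };
  for (auto end : splits) { emit(end); }
  emit(num_rows);  // final piece runs to the last row
  return pieces;
}

// Checks the documented two-column example with splits {2, 5, 9}.
void check_table_example()
{
  host_table const input{{10, 12, 14, 16, 18, 20, 22, 24, 26, 28},
                         {50, 52, 54, 56, 58, 60, 62, 64, 66, 68}};
  auto const pieces = split_rows(input, {2, 5, 9});
  assert(pieces.size() == 4);
  assert((pieces[1][0] == host_column{14, 16, 18}));
  assert((pieces[1][1] == host_column{54, 56, 58}));
}
```

Note that the documented example prints the output grouped by column, while the returned vector is indexed by piece first: element 1 of the result holds {14, 16, 18} and {54, 56, 58}.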

◆ split() [4/4]

std::vector<table_view> cudf::split ( table_view const &  input,
std::initializer_list< size_type >  splits,
rmm::cuda_stream_view  stream = cudf::get_default_stream() 
)

Splits a table_view into a set of table_views according to a set of indices derived from expected splits.

The returned views of input are constructed from a vector of splits, which indicates where each split should occur. The ith returned table_view is sliced as [0, splits[i]) if i=0, as [splits[i-1], input.size()) if i is the last view, and as [splits[i-1], splits[i]) otherwise.

For all i it is expected splits[i] <= splits[i+1] <= input.size(). For a splits size N, there will always be N+1 splits in the output.

Note
It is the caller's responsibility to ensure that the returned views do not outlive the viewed device memory.
Example:
input: [{10, 12, 14, 16, 18, 20, 22, 24, 26, 28},
{50, 52, 54, 56, 58, 60, 62, 64, 66, 68}]
splits: {2, 5, 9}
output: [{{10, 12}, {14, 16, 18}, {20, 22, 24, 26}, {28}},
{{50, 52}, {54, 56, 58}, {60, 62, 64, 66}, {68}}]
Exceptions
cudf::logic_error  If splits has end index > size of input.
cudf::logic_error  When the value in splits is not in the range [0, input.size()).
cudf::logic_error  When the values in the splits are 'strictly decreasing'.
Parameters
input  View of a table to split
splits  Indices where the view will be split
stream  CUDA stream used for device memory operations and kernel launches
Returns
The set of requested views of input indicated by the splits

◆ unpack() [1/2]

table_view cudf::unpack ( packed_columns const &  input)

Deserialize the result of cudf::pack.

Converts the result of a serialized table into a table_view that points to the data stored in the contiguous device buffer contained in input.

It is the caller's responsibility to ensure that the table_view in the output does not outlive the data in the input.

No new device memory is allocated in this function.

Parameters
input  The packed columns to unpack
Returns
The unpacked table_view
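The ownership relationship described here (metadata on one side, a single contiguous data buffer on the other, and an unpacked view that allocates nothing and merely points into that buffer) can be modeled on the host. Everything below is a toy analogy under stated assumptions: toy_packed, toy_column_view, toy_pack, and toy_unpack are hypothetical names, the metadata layout is invented for illustration, and none of it reflects libcudf's actual serialization format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Toy host-only analogy of a packed table (not libcudf's format).
struct toy_packed {
  std::vector<std::uint8_t> metadata;  // invented layout: one std::size_t row count per column
  std::vector<std::int32_t> data;      // all column values stored back to back
};

// Non-owning view of one column inside toy_packed::data.
struct toy_column_view {
  std::int32_t const* ptr;
  std::size_t size;
};

// Deep-copies the columns into one contiguous buffer and records their sizes.
toy_packed toy_pack(std::vector<std::vector<std::int32_t>> const& columns)
{
  toy_packed out;
  for (auto const& col : columns) {
    std::size_t const n = col.size();
    auto const* n_bytes = reinterpret_cast<std::uint8_t const*>(&n);
    out.metadata.insert(out.metadata.end(), n_bytes, n_bytes + sizeof(n));
    out.data.insert(out.data.end(), col.begin(), col.end());
  }
  return out;
}

// Rebuilds views from the metadata. Column data is not copied; the views
// are only valid while `packed` is alive and unmodified.
std::vector<toy_column_view> toy_unpack(toy_packed const& packed)
{
  std::vector<toy_column_view> views;
  std::size_t offset = 0;
  for (std::size_t m = 0; m < packed.metadata.size(); m += sizeof(std::size_t)) {
    std::size_t n = 0;
    std::memcpy(&n, packed.metadata.data() + m, sizeof(n));
    views.push_back({packed.data.data() + offset, n});
    offset += n;
  }
  return views;
}

// Round-trips two columns and confirms the views alias the packed buffer.
void check_round_trip()
{
  auto const packed = toy_pack({{1, 2, 3}, {4, 5}});
  auto const views  = toy_unpack(packed);
  assert(views.size() == 2);
  assert(views[1].ptr == packed.data.data() + 3);  // points into the buffer, no copy
  assert(views[1].ptr[0] == 4);
}
```

Because the views alias the buffer owned by toy_packed, destroying or resizing that object invalidates them, which is the same lifetime obligation the caller has for the table_view returned by unpack.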

◆ unpack() [2/2]

table_view cudf::unpack ( uint8_t const *  metadata,
uint8_t const *  gpu_data 
)

Deserialize the result of cudf::pack.

Converts the result of a serialized table into a table_view that points to the data stored in the contiguous device buffer contained in gpu_data using the metadata contained in the host buffer metadata.

It is the caller's responsibility to ensure that the table_view in the output does not outlive the data in the input.

No new device memory is allocated in this function.

Parameters
metadata  The host-side metadata buffer resulting from the initial pack() call
gpu_data  The device-side contiguous buffer storing the data that will be referenced by the resulting table_view
Returns
The unpacked table_view