On this page
tf.raw_ops.StringNGrams
Creates ngrams from ragged string data.
tf.raw_ops.StringNGrams(
data,
data_splits,
separator,
ngram_widths,
left_pad,
right_pad,
pad_width,
preserve_short_sequences,
name=None
)
This op accepts a ragged tensor with 1 ragged dimension containing only strings and outputs a ragged tensor with 1 ragged dimension containing ngrams of that string, joined along the innermost axis.
| Args | |
|---|---|
data |
A Tensor of type string. The values tensor of the ragged string tensor to make ngrams out of. Must be a 1D string tensor. |
data_splits |
A Tensor. Must be one of the following types: int32, int64. The splits tensor of the ragged string tensor to make ngrams out of. |
separator |
A string. The string to append between elements of the token. Use "" for no separator. |
ngram_widths |
A list of ints. The sizes of the ngrams to create. |
left_pad |
A string. The string to use to pad the left side of the ngram sequence. Only used if pad_width != 0. |
right_pad |
A string. The string to use to pad the right side of the ngram sequence. Only used if pad_width != 0. |
pad_width |
An int. The number of padding elements to add to each side of each sequence. Note that padding will never be greater than 'ngram_widths'-1 regardless of this value. If pad_width=-1, then add max(ngram_widths)-1 elements. |
preserve_short_sequences |
A bool. |
name |
A name for the operation (optional). |
| Returns | |
|---|---|
A tuple of Tensor objects (ngrams, ngrams_splits). |
|
ngrams |
A Tensor of type string. |
ngrams_splits |
A Tensor. Has the same type as data_splits. |
© 2022 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 4.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/versions/r2.9/api_docs/python/tf/raw_ops/StringNGrams