...
Code Block | ||
---|---|---|
| ||
type StringField interface {
    extract(segmentOffsets []int32) []string
    serialize() []byte
    deserialize([]byte)
}

func Filter(expression string, field StringField) (segmentOffsets []int32) |
The extract method of StringField retrieves the strings stored at the given segment offsets.
The Filter function evaluates the expression string against the StringField and returns the segment offsets of the matching entities.
The serialize method serializes the field into a slice of bytes, which can be stored in object storage as an index file.
The deserialize method reconstructs a StringField object from the index file.
The following gives a C++ definition of HistoricalStringField.
Code Block | ||
---|---|---|
| ||
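#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Opaque byte buffer holding one serialized piece of the index.
class Blob {
    std::vector<char> blob_;
};

class HistoricalStringField1 {
 public:
    // Returns the strings stored at the given segment offsets.
    std::vector<std::string>
    extract(const std::vector<std::int32_t>& segmentOffsets);
    std::vector<Blob>
    serialize();
    void
    deserialize(const std::vector<Blob>&);

 private:
    // Binary searched by the comparison operations, so it must stay sorted.
    std::vector<std::string> strs;
    std::unordered_map<std::int32_t, std::vector<std::int32_t>> strOffsetToSegOffsets;
    std::vector<std::int32_t> segOffsetToStrOffset;
}; |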
strOffsetToSegOffsets maps a string offset to segment offsets. The same string can appear in multiple entities, so the value type is a vector.
segOffsetToStrOffset maps a segment offset to a string offset. With the string offset, the original string can be retrieved from strs.
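To illustrate how the three members relate, the following minimal sketch builds them from raw data. It assumes that raw[i] holds the string of the entity at segment offset i and that strs is kept sorted and deduplicated. StringIndex and buildIndex are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the three members of HistoricalStringField1.
struct StringIndex {
    std::vector<std::string> strs;  // assumed sorted and unique
    std::unordered_map<std::int32_t, std::vector<std::int32_t>> strOffsetToSegOffsets;
    std::vector<std::int32_t> segOffsetToStrOffset;
};

// Builds the index from raw data, where raw[i] is the string of the
// entity at segment offset i.
StringIndex buildIndex(const std::vector<std::string>& raw) {
    StringIndex idx;

    // Sort and deduplicate so that every distinct string gets one offset.
    idx.strs = raw;
    std::sort(idx.strs.begin(), idx.strs.end());
    idx.strs.erase(std::unique(idx.strs.begin(), idx.strs.end()), idx.strs.end());

    idx.segOffsetToStrOffset.reserve(raw.size());
    for (std::size_t seg = 0; seg < raw.size(); ++seg) {
        // Every raw string is present in strs, so lower_bound finds it exactly.
        auto it = std::lower_bound(idx.strs.begin(), idx.strs.end(), raw[seg]);
        auto strOff = static_cast<std::int32_t>(it - idx.strs.begin());
        idx.segOffsetToStrOffset.push_back(strOff);
        idx.strOffsetToSegOffsets[strOff].push_back(static_cast<std::int32_t>(seg));
    }
    return idx;
}
```

For the raw data {"b", "a", "b", "c"}, this yields strs = {"a", "b", "c"}, and the string "b" (string offset 1) maps to the segment offsets 0 and 2.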
The comparison operations ("==", "!=", "<", "<=", ">", ">=") are thus reduced to a binary search on strs, which requires strs to be kept sorted, to find the matching string offsets; these are then converted to segment offsets through strOffsetToSegOffsets.
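The lookup described above can be sketched for "==" and "<" as follows. This is only an illustration under the assumption that strs is sorted and unique; StringIndex, filterEqual, and filterLess are hypothetical names, and the remaining operators would follow the same pattern with std::lower_bound or std::upper_bound.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the three members of HistoricalStringField1.
struct StringIndex {
    std::vector<std::string> strs;  // assumed sorted and unique
    std::unordered_map<std::int32_t, std::vector<std::int32_t>> strOffsetToSegOffsets;
    std::vector<std::int32_t> segOffsetToStrOffset;
};

// "==": binary search for the string, then map its offset to segment offsets.
std::vector<std::int32_t> filterEqual(const StringIndex& idx, const std::string& value) {
    auto it = std::lower_bound(idx.strs.begin(), idx.strs.end(), value);
    if (it == idx.strs.end() || *it != value) {
        return {};  // the string does not occur in this segment
    }
    auto strOff = static_cast<std::int32_t>(it - idx.strs.begin());
    auto hit = idx.strOffsetToSegOffsets.find(strOff);
    if (hit == idx.strOffsetToSegOffsets.end()) {
        return {};
    }
    return hit->second;
}

// "<": every string offset before lower_bound(value) holds a smaller string.
std::vector<std::int32_t> filterLess(const StringIndex& idx, const std::string& value) {
    auto bound = std::lower_bound(idx.strs.begin(), idx.strs.end(), value);
    auto end = static_cast<std::int32_t>(bound - idx.strs.begin());
    std::vector<std::int32_t> out;
    for (std::int32_t strOff = 0; strOff < end; ++strOff) {
        auto hit = idx.strOffsetToSegOffsets.find(strOff);
        if (hit != idx.strOffsetToSegOffsets.end()) {
            out.insert(out.end(), hit->second.begin(), hit->second.end());
        }
    }
    std::sort(out.begin(), out.end());  // return segment offsets in ascending order
    return out;
}
```

The binary search costs O(log n) string comparisons for n distinct strings, regardless of how many entities share the same string.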
The extract interface only needs to look up the string offset of each segment offset in segOffsetToStrOffset and return the corresponding string from strs.
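A minimal sketch of this two-step lookup is shown below. StringIndex is a hypothetical struct that carries only the two members the lookup needs, and the free function extract stands in for the member function of the same name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical struct with only the members that extract needs.
struct StringIndex {
    std::vector<std::string> strs;
    std::vector<std::int32_t> segOffsetToStrOffset;
};

// Returns the string of each requested segment offset, in request order.
// at() throws std::out_of_range for offsets outside the segment.
std::vector<std::string> extract(const StringIndex& idx,
                                 const std::vector<std::int32_t>& segmentOffsets) {
    std::vector<std::string> out;
    out.reserve(segmentOffsets.size());
    for (std::int32_t seg : segmentOffsets) {
        std::int32_t strOff = idx.segOffsetToStrOffset.at(static_cast<std::size_t>(seg));
        out.push_back(idx.strs.at(static_cast<std::size_t>(strOff)));
    }
    return out;
}
```

Each lookup is two array accesses, so extract runs in time proportional to the number of requested segment offsets.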
When no index file exists in the object store, QueryNode loads the original data from the object store and creates a StringField object from it.
When the index file exists, QueryNode loads it from the object store and calls the deserialize method of StringField to generate a StringField object.
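The two loading paths can be sketched as follows. Everything here is an assumption made for illustration: the object store is modeled as an in-memory map, the raw data is passed in already loaded, the simplified StringField keeps only the per-entity strings, and the length-prefixed encoding (in native byte order) is not the actual index file format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <vector>

using Blob = std::vector<char>;
// Hypothetical in-memory object store: key -> file contents.
using ObjectStore = std::map<std::string, Blob>;

// Simplified field that keeps only the string of each segment offset.
struct StringField {
    std::vector<std::string> values;

    // Encodes each string as [uint32 length][bytes].
    Blob serialize() const {
        Blob out;
        for (const auto& s : values) {
            auto n = static_cast<std::uint32_t>(s.size());
            const char* p = reinterpret_cast<const char*>(&n);
            out.insert(out.end(), p, p + sizeof(n));
            out.insert(out.end(), s.begin(), s.end());
        }
        return out;
    }

    // Rebuilds the field from a blob produced by serialize().
    void deserialize(const Blob& in) {
        values.clear();
        std::size_t pos = 0;
        while (pos + sizeof(std::uint32_t) <= in.size()) {
            std::uint32_t n = 0;
            std::memcpy(&n, in.data() + pos, sizeof(n));
            pos += sizeof(n);
            if (pos + n > in.size()) {
                break;  // truncated input: stop instead of reading past the end
            }
            values.emplace_back(in.data() + pos, n);
            pos += n;
        }
    }
};

// Uses the index file when it exists, otherwise builds the field from raw data.
StringField loadStringField(const ObjectStore& store, const std::string& indexKey,
                            const std::vector<std::string>& rawData) {
    StringField field;
    auto it = store.find(indexKey);
    if (it != store.end()) {
        field.deserialize(it->second);
    } else {
        field.values = rawData;
    }
    return field;
}
```

With an empty store, loadStringField builds the field from rawData; once the result of serialize() is stored under indexKey, later calls take the deserialize path and ignore rawData.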