Reading Data
Learn the process for reading data from a Synnax cluster.
This page walks you through the lifecycle of a read operation. If you’d like a practical guide on reading data using a client library, take a look at the respective pages for Python and TypeScript.
What is a Range?
A range (short for “time range”) is simply an interesting region of a channel’s data.
To read from a cluster, you must provide two pieces of information:
- A range marked by a start and end timestamp.
- A list of channels you would like to read from.
Using these parameters, the cluster will resolve the relevant domains for each channel and return the requested data.
Range Inclusivity
Ranges are start inclusive and end exclusive. For example, if our desired range is from 1677433720970863400 to 1677433721870863400, we would retrieve every sample in the domain whose timestamp is greater than or equal to 1677433720970863400 and strictly less than 1677433721870863400.
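The half-open rule can be illustrated with a small, self-contained sketch. It is not the Synnax client API: the read_range helper and the sample domain (one sample every 100 ms) are hypothetical and exist only to show which timestamps fall inside the range.

```python
from bisect import bisect_left


def read_range(timestamps, start, end):
    """Return the samples with start <= t < end from a sorted list."""
    lo = bisect_left(timestamps, start)  # first index with t >= start
    hi = bisect_left(timestamps, end)    # first index with t >= end (excluded)
    return timestamps[lo:hi]


# Hypothetical domain: nanosecond timestamps spaced 100 ms apart.
STEP = 100_000_000
START = 1677433720970863400
END = 1677433721870863400
domain = [START + i * STEP for i in range(-2, 12)]

subset = read_range(domain, START, END)
```

In this sketch the sample stamped exactly at the start of the range is part of the subset, while the sample stamped exactly at the end is not.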
Iterators
Underneath every read operation is an iterator. An iterator allows the caller to traverse through a range of data in a streaming fashion. Iterators are more complex to work with, so we recommend using a single request-response read when possible.
Just like a read, an iterator can be created by providing a range and a list of channels. Creating an iterator will not return any data, but will instead open a persistent connection to the cluster. You can then perform two categories of operation: seeking and reading.
Validity
Iterators maintain a validity state throughout their lifetime. This flag indicates whether the iterator is healthy and has more data to read. An iterator is considered invalid if it:
- Has accumulated an error. This typically happens when the iterator is unable to reach the cluster.
- Is not pointing at a valid sample. This can occur if the iterator:
  - Has exhausted its data, meaning the end of the range is reached during forward iteration, or the start of the range is reached during reverse iteration.
  - Has not been positioned yet with a seeking call.
An iterator that has been opened but not yet positioned is invalid. To position the iterator, you must call a seeking operation.
Seeking
Seeking moves the iterator to a new position in the range. All seeking calls return a boolean indicating the validity state after executing the operation.
Operation | Arguments | Description |
---|---|---|
seek-lt | timestamp | Seeks to the first sample whose timestamp is strictly less than the provided timestamp. |
seek-ge | timestamp | Seeks to the first sample whose timestamp is greater than or equal to the provided timestamp. |
seek-first | None | Seeks to the first sample in the range. |
seek-last | None | Seeks to the last sample in the range. |
Seeking calls can be used to revalidate an iterator after it has been exhausted or positioned to an invalid location. In the case of an accumulated error, this call may or may not succeed.
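The sketch below models the four seeking operations over a sorted, in-memory list of timestamps. It is a toy under stated assumptions, not the Synnax client API: the SketchIterator class and its method names are hypothetical, and seek_lt is assumed to land on the closest sample before the given timestamp.

```python
from bisect import bisect_left


class SketchIterator:
    """Toy in-memory model of iterator seeking (hypothetical, not the real client)."""

    def __init__(self, timestamps):
        self._ts = sorted(timestamps)
        self._pos = None  # opened but not positioned, so invalid

    def valid(self):
        return self._pos is not None and 0 <= self._pos < len(self._ts)

    def timestamp(self):
        # Only meaningful while the iterator is valid.
        return self._ts[self._pos]

    def seek_first(self):
        self._pos = 0
        return self.valid()

    def seek_last(self):
        self._pos = len(self._ts) - 1
        return self.valid()

    def seek_ge(self, ts):
        # First sample whose timestamp is >= ts.
        self._pos = bisect_left(self._ts, ts)
        return self.valid()

    def seek_lt(self, ts):
        # Assumption: closest sample whose timestamp is strictly < ts.
        self._pos = bisect_left(self._ts, ts) - 1
        return self.valid()


it = SketchIterator([10, 20, 30])
opened_valid = it.valid()      # not yet positioned
past_end = it.seek_ge(31)      # no sample at or after 31
revalidated = it.seek_first()  # a seek can revalidate the iterator
```

Every seek returns the validity state, so a caller can check the result before reading rather than inspecting the iterator separately.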
Reading
There are two methods for reading from an iterator. The first reads a fixed number of samples, called the chunk size, which can be set when creating the iterator. Each call to next or prev without any arguments will return the next chunk of data.
The second is by providing a timespan to read. This is useful for seeking to and reading specific sections of data. When using span-based iteration, be wary of reading too large a span, as this can heavily degrade performance. Reading by a span is start inclusive and end exclusive, regardless of the direction of iteration.
As with seeking operations, all reads return a boolean indicating the validity state after executing the operation.
Operation | Arguments | Description |
---|---|---|
next | timespan or nothing | If no timespan is provided, reads the next frame of data specified by the chunk size. If a timespan is provided, reads the next frame of data across the span. |
prev | timespan or nothing | Reads the previous chunk of data whose timespan is less than or equal to the provided timespan. If no timespan is provided, reads the previous chunk of data specified by the chunk size. |
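To make the span-based rule concrete, here is a minimal sketch of a single span read in either direction. It assumes the iterator's position is a timestamp that advances by the span on each read; the read_span helper and that position model are hypothetical and are not taken from the Synnax client.

```python
from bisect import bisect_left


def read_span(timestamps, position, span, forward=True):
    """Return (samples, new_position) for one span-based read.

    The window is start inclusive and end exclusive in both directions.
    """
    if forward:
        start, end = position, position + span
    else:
        start, end = position - span, position
    lo = bisect_left(timestamps, start)  # first sample with t >= start
    hi = bisect_left(timestamps, end)    # first sample with t >= end (excluded)
    new_position = end if forward else start
    return timestamps[lo:hi], new_position


ts = [0, 10, 20, 30, 40, 50]
fwd, pos = read_span(ts, 10, 20)                       # window [10, 30)
back, pos_back = read_span(ts, pos, 20, forward=False)  # window [10, 30) again
```

Reading forward from 10 and then backward from 30 covers the same window, and the sample at 30 is excluded both times because the end of the window is exclusive regardless of direction.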
Accessing the Iterator Value
While read operations do fetch frames from the cluster, they do not return them directly. Instead, the current frame is kept in client-side memory, and can be accessed through the value method on the iterator.
This method returns a frame with the same format as in unary reads. If the iterator is invalid, calls to value have undefined behavior. If the iterator has been positioned but not yet read from, the frame will be empty.
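The usual pattern that follows from this is: seek, then loop on next while it reports a valid state, and fetch each frame through value. The sketch below models that loop over an in-memory list. SketchReader and its methods are hypothetical stand-ins rather than the Synnax client API, and a plain list plays the role of a frame.

```python
class SketchReader:
    """Toy model of chunked reads with a client-side current frame (hypothetical)."""

    def __init__(self, samples, chunk_size):
        self._samples = list(samples)
        self._chunk = chunk_size
        self._pos = None  # not positioned yet
        self._frame = []

    def seek_first(self):
        self._pos = 0
        self._frame = []  # positioned but not yet read: empty frame
        return len(self._samples) > 0

    def next(self):
        # Reads the next chunk into the current frame and returns validity.
        if self._pos is None or self._pos >= len(self._samples):
            self._frame = []
            return False
        self._frame = self._samples[self._pos:self._pos + self._chunk]
        self._pos += len(self._frame)
        return True

    def value(self):
        # The real behavior on an invalid iterator is undefined; this
        # sketch simply returns an empty list in that case.
        return self._frame


reader = SketchReader(range(7), chunk_size=3)
chunks = []
if reader.seek_first():
    frame_before_read = reader.value()
    while reader.next():
        chunks.append(reader.value())
```

Because the read call only reports validity, the loop condition doubles as the exhaustion check, and the final partial chunk is still delivered before next returns false.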