Punctuation
in TelegraphCQ
TelegraphCQ supports basic punctuation tuples. A punctuation tuple is a
tuple with just a timestamp attribute and no data. TelegraphCQ inserts
punctuation tuples into streams to indicate that no tuples with
timestamps before than the indicated value will occur in the stream.
Implementation of Punctuation
Punctuation tuples are implemented as HeapTuples
with every field Additionally, TelegraphCQ sets the bit HEAP_ISPUNCT
in a punctuation tuple's header. Code
can use the HeapTupleIsPunct()
macro to
check whether a tuple is a punctuation tuple.
TelegraphCQ uses the function form_punct_tup()
to create punctuation tuples.
Punctuation Producers
Portions of TelegraphCQ that generate punctuation tuples are:
- The CSV wrapper: If
the
CSV wrapper is generating its own timestamps, it will send periodic
punctuation tuples during quiet periods. Currently, the wrapper sends a
punctuation tuple whenever it goes for a second without receiving any
tuples.
- The Fjord Aggregate node: This
operator, which takes care of windowed aggregates, places a punctuation
tuple after every group of aggregate tuples. The node timestamps the
tuple with the next point in
time at which it will output aggregate results, indicating that there
will be no results until then. This punctuation should allow a query
that uses an aggregate subquery to put a time window over that
aggregate.
Punctuation Consumers
We modified several operators to handle punctuation properly:
- The Agg node: This node, which handles GROUP BY and aggregation, consumes punctuation
tuples without sending them through its tree of operators.
- The GSFilter node:
This node, which handles most selections, passes punctuation tuples
through to all queries.
- The Filter node:
This node, which handles generic selection predicates for a single
stream, always passes through punctuation tuples.
- The HeapTuple code: The
function DataFill(), which is used to fill
in the t_data fields of copied HeapTuples,
now copies over the HEAP_ISPUNCT flag, along
with the other flags.
- Projection functions: ExecTargetList(), which performs projection by
creating a copy of the input tuple with just the fields being kept, now
copies the HEAP_ISPUNCT flag into the new
tuple.
- Data Triage: The code
that constructs summaries of kept and triaged tuples (in particular the
function tupbuf_insert()) ignores
punctuation.
- The FSTeM Module: Work in progress; when an FSTeM
receives a punctuation tuple as a probe, it should append a HeapTuple
of NULLs to the punctuation tuple and return the resulting
IntermediateHeapTuple to the Eddy.