------------------------------------------------------------------------------
APPROXIMATION and QoS PANEL
------------------------------------------------------------------------------
Coordinator: Mitch Cherniack
Participants: Dave Maier, Rajeev Motwani, Johannes Gehrke, Hari Balakrishnan
------------------------------------------------------------------------------

MITCH CHERNIACK

Mitch emphasized three talking points:
1. Justify whether QoS is appropriate for streams.
   (At CIDR, QoS also came up in Telegraph - Franklin called it "graceful degradation".)
2. If the answer to 1 is "yes", how to specify QoS for streams.
3. How to process QoS specifications.

Definition of QoS:
- example: multimedia -> error tolerance for jitter
  -> x-axis: error metric; y-axis: merit factor (i.e., how happy you are with a result, ranging from 0 to 1)
- traditional network applications
  -> admission control
  -> resource reservation
  -> load shedding
- for real-time applications, best effort is not enough

Mitch's answer to point 1:
- two characteristics:
  -> resource management
  -> different applications have different requirements
- examples of different applications with different tolerances:
  -> remote triage - every two seconds
  -> personnel info - low tolerance for latency
  -> where the enemy is - high accuracy, can wait longer

Sirish: Why not use more machines? The real issue is unpredictability and variability.
Brian: Is a missile attack equivalent to remote triage? There are separate ways of doing things in separate systems.
Sirish: When there is sharing, QoS is harder.
Mitch: If there is a separate machine for each query, that is hard-coded resource allocation. We looked at different places where we can use QoS in Aurora.
Buchmann: QoS of message delivery? In distributed systems, there are different semantics like "at most once", "at least once".
Mitch: Errors and tolerance depend on the application.
Buchmann: QoS not at the query-processor level, but at other levels of messaging/networking.

Answer to points 2 and 3:
- two differences between stream networks and packet networks justify point 1:
  1. Output tuples are not a subset of input tuples: not just routing, but producing new ones - this makes a difference.
  2. Semantic richness: one can take advantage of knowing about tuples.
- for the first difference:
  -> In packet networks, packets either make it there or are dropped.
     Inaccuracy = dropped/total, anywhere within the network.
  -> In stream networks, a tuple is lost at a SELECT_p operator depending on whether p(t) or ~p(t) holds.
     If a tuple is lost before a join, several tuples may be lost from the output.
     There is a non-linearity; this is different from packet networks.
  -> It is hard to determine the impact of dropped tuples on the output; it depends heavily on where they are dropped.
     (See the sketch at the end of this section.)
- for semantic richness:
  -> Some tuples are more important. In Aurora, this affects:
     load shedding - dropping
     scheduling - reordering

Alin: How do you come up with the graphs?
Mitch: It is hard. Characterizing the abnormal ones is important.
Alin: In a join, the importance of values in one stream depends on values in the other.
Mitch: QoS is attached to applications, not boxes.
Alin: Propagating QoS?
Mitch: Processing QoS is not easy. Dropping randomly vs. predictive analysis that maximizes QoS.
Mehul: When is it impossible to meet QoS? Any analysis?
Mitch: The assumption is that every QoS specification is feasible.
Stan: Depends on rates and resources.
Johannes: Sampling over joins.
Carlo: Several applications?
Mitch: Within an app, stale tuples in the queue vs. latency requirements - it may be better to drop them.
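A minimal sketch of the non-linearity point above, assuming a simple equi-join over two synthetic streams (the stream contents, key distribution, and drop rates are made up for illustration): dropping a fraction of tuples uniformly at random before a join loses a larger fraction of the join's output than the input drop rate itself.

```python
import random

random.seed(0)
# Two synthetic streams of (key, value) tuples with 10 distinct join keys.
left  = [("k%d" % random.randint(0, 9), i) for i in range(1000)]
right = [("k%d" % random.randint(0, 9), j) for j in range(1000)]

def join_size(l, r):
    # Cardinality of the equi-join on the first field (simple hash join).
    counts = {}
    for key, _ in r:
        counts[key] = counts.get(key, 0) + 1
    return sum(counts.get(key, 0) for key, _ in l)

full = join_size(left, right)
for drop in (0.1, 0.3, 0.5):
    # Drop tuples uniformly at random from both inputs before joining.
    kept_l = [t for t in left if random.random() > drop]
    kept_r = [t for t in right if random.random() > drop]
    lost = 1 - join_size(kept_l, kept_r) / full
    print("input drop %.0f%% -> output loss about %.0f%%" % (drop * 100, lost * 100))
# With a 30% drop on each input, roughly 1 - 0.7 * 0.7 = 51% of the output
# disappears, illustrating the non-linear impact of dropping tuples before a join.
```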
------------------------------------------------------------------------------

DAVE MAIER

Multimedia QoS
- streaming media: streaming data rather than data streams
- lessons learned
  -> non-linearity between doing random and doing thoughtful things
  -> the error model differs
  -> control theory: control parameters to avoid oscillations;
     hard to apply since it is a discontinuous system
  -> hardest is the instrumentation for watching the system in order to tune it; make sure it exists
- the model might apply to streams, but that is not guaranteed

Ways to explain error
- quantization - cut down on resolution
- quantization and drift are worse than shift and lag

Error model
- completeness: can you express an error with it?
- more than one interpretation
  -> how they compose
  -> how to deal with them depends on the app

Use of the error model
- you don't want to measure the quality of the error
- instead, precompute the effect of different load-shedding options and switch between them based on the current resources

------------------------------------------------------------------------------

RAJEEV MOTWANI

Disclaimer
- common-sense info
- no QoS in the STREAM system

Should we have QoS in stream systems? Sure, why not?

Start with an example
- an IT organization
- network devices, servers, apps
- each generating streams at the level of SNMP, packets, syslogs
- put all of them into a centralized management system to do
  -> security
  -> why something crashed
  -> if PeopleSoft is down, why?
  -> continuous queries and ad-hoc analysis

Characteristics of such apps / killer apps:
- variable, bursty load
- large number of continuous queries
- real-time, online
- event-driven
  -> example: if an intrusion is detected, then do totally different things
  -> QoS becomes important
  -> change resource allocation

QoS guarantees
- load shedding
- prioritization
- degree of load shedding
- react to run-time events
QoS must be simple and effective.

Mechanisms
- sampling
  -> problem: effective for simple queries; the effect on complex queries is hard to characterize
- alertness
  -> boolean output (false positives and false negatives)
  -> haystack problem
  -> security alerts - may miss important events
- approximation
  -> approximate processing mechanisms in general - not much progress even outside the streaming context
  -> simple queries are ok

Jennifer: Input sampling is bad for interesting things like joins. Load shedding may eliminate everything - it eliminates the needle in the haystack.
Dennis: Can use a fact table.
Alin: Foreign-key joins.
Jennifer: They are the same.
Rajeev: The situation for streams is different, not as in the fact-table case.
Magda: There are multiple techniques. Based on the app, we can use an appropriate technique.
Rajeev: We don't have a metric to decide on that.

- focusing
  -> value-based load shedding
  -> focus on certain domain values (see the sketch at the end of this section)
  Jennifer: example: incoming missiles vs. squirrels
  Johannes: gold customers
- skipping
  -> dropping output?
  -> sampling at a different level
  -> accumulating and processing later
- jumping windows
- shrinking windows
- latency
  -> does not shed load, postpones it
  -> for burstiness?
  -> batching for more throughput
- level at which to apply load shedding
  -> input/output/subquery

Expressing QoS
1. What form is acceptable? Where to do it?
2. Weights, partial order, run-time, compile-time.
3. Penalty, utility
   -> Aurora - utility
   -> rate chart - the app specifies based on rate
   -> threshold: "these are critical queries, here is the max for them; for the rest, do the best you can"

Sample models
- Aurora

Delivering
- MQO (multi-query optimization) is a problem; it constrains things.
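A minimal sketch, not taken from Aurora or STREAM, contrasting uniform random load shedding with the value-based "focusing" shedding discussed above: the shedder keeps every tuple for an important domain value (here, a hypothetical "gold" customer class) and drops only among the rest to meet the target rate. The tuple schema, the 5% share of gold customers, and the 50% shedding target are all illustrative assumptions.

```python
import random

random.seed(1)
stream = [{"cust": "gold" if random.random() < 0.05 else "regular", "val": i}
          for i in range(10_000)]

def random_shed(tuples, keep_fraction):
    # Content-blind shedding: keep each tuple with probability keep_fraction.
    return [t for t in tuples if random.random() < keep_fraction]

def value_based_shed(tuples, keep_fraction):
    # Focusing: keep every gold-customer tuple, shed only among the rest
    # so the overall output rate still meets the target.
    gold = [t for t in tuples if t["cust"] == "gold"]
    rest = [t for t in tuples if t["cust"] != "gold"]
    budget = max(0, int(keep_fraction * len(tuples)) - len(gold))
    return gold + random.sample(rest, min(budget, len(rest)))

for shed in (random_shed, value_based_shed):
    out = shed(stream, 0.5)
    gold_kept = sum(t["cust"] == "gold" for t in out)
    print(shed.__name__, "kept", len(out), "tuples,", gold_kept, "gold")
# Both meet roughly the same output rate, but only the value-based shedder
# keeps all of the "needle in the haystack" gold tuples.
```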
------------------------------------------------------------------------------

JOHANNES GEHRKE

Why? How? What?

Competing axes
- combining them is hard for the user
- machine-learning techniques may be needed

Two models
- fast CPU - not enough main memory
- slow CPU - can't keep up
Control QoS by how much space you give.

Stan: This is not reactive. You pre-compute. Not for spikes.
Johannes: You can't do the adjustment quickly enough for spikes.

Sketches
- not a very good method for stream systems

Set-valued queries
- use a synopsis?
- subset approximation - which subset?
- element frequencies, values
- metric from Stanford (EMD)
- max subset ~ symmetric set difference

Sampling
- constructing sets from samples

Histograms
- computation in histogram space
- problem: high-dimensional histograms, joins

Semantic load shedding
- minimum number of edges deleted (in a join)
- window-based?
- hard problems

Open problems
- classes of metrics
- an interface to specify them

Two mini points
- composing techniques
- multiple queries
  -> space across all the sketches
  -> care needed
- hardest: specifying metrics and giving feedback to the user - the right metrics and how to state them

Question: It is hard - the consequences for the result, the effect on the metric. Because of this, some techniques might be more desirable than others.

------------------------------------------------------------------------------

HARI BALAKRISHNAN

QoS-first, QoS-later, or QoS-never?
- networking QoS - why didn't it work on the Internet before, and why is it working now?
- design QoS such that it is practical and money can be made from it (so that it gets used)

Rajeev: SLA or QoS?
Hari: SLAs don't have QoS, but they are starting to have rate specifications. There is a focus on delay.

Handling overload
- what to drop
- ISPs - what to accept and what not to; today this is for protection purposes

Sharing resources
- macrostreams: gold -> queue1, silver -> queue2; who does the mapping?
- work-conserving scheme
  -> the scheduler is not idle if there is a job waiting
  -> same mean delay (per-queue delay, not per-packet delay)
  -> trades off delay between different streams

WFQ (Weighted Fair Queuing)
- bandwidth or rate-sharing guarantee
- bounds on worst-case delay
- bounded delay on streams -> use WFQ
- makes bursts look like a linear process
(a small scheduler sketch follows at the end of this section)

Franklin: In a broadcast environment, it is not clear whether this works.

SQ FIFO (single queue, FIFO)
- everybody on the same queue
- minimizes jitter
- preserves bursts - may be important for applications
- WFQ vs. SQ FIFO - they push in opposite directions

Dennis: What is jitter?
Hari: A burst of packets arriving; deviation from a smooth, linear arrival of packets.

QoS models
- relative: who is Gold/Silver changes over time
- end-to-end or per-hop
  -> end-to-end is harder - optimizes app performance
  -> per-hop - where the money is

Utility functions
- hard to express, except in two cases

Economic principles
- what brings benefit to providers, not to apps

Inverse of Aurora's utility functions
- elastic app -> like file transfer (weighted round-robin works well)
- drop-dead -> has to respond within a certain time
  -> needs adequate provisioning
  -> no real way to bound the number of missiles (can't solve the problem)
- elastic
  -> concave is easier
  -> concave then convex is harder (as in multimedia apps)
  -> convex portion - maximize the slope

Rajeev: Aggregate utility? Stream networks vs. packet networks?
Hari: utility x degree of commonality - one more dimension

Medusa claims
- differences from networks
  -> joint management of multiple resources
  -> a simple service model is needed
  -> should be practical
  -> feedback to sources (not present in conventional databases)
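A minimal sketch of the weighted fair queuing idea above, under assumptions not stated in the talk: a self-clocked, simplified WFQ variant that serves two macrostreams ("gold" with weight 2, "silver" with weight 1) from a single work-conserving scheduler by ordering items on virtual finish times. The weights, stream names, and fixed per-item cost are illustrative, and the scheme ignores the system virtual clock, so it only approximates true WFQ while both queues stay backlogged.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Item:
    finish: float                       # virtual finish time (sort key)
    stream: str = field(compare=False)
    payload: int = field(compare=False)

class WFQScheduler:
    def __init__(self, weights):
        self.weights = weights                        # e.g. {"gold": 2.0, "silver": 1.0}
        self.last_finish = {s: 0.0 for s in weights}
        self.heap = []

    def enqueue(self, stream, payload, cost=1.0):
        # Higher weight -> smaller finish-time increments -> a proportionally
        # larger share of service for that stream.
        finish = self.last_finish[stream] + cost / self.weights[stream]
        self.last_finish[stream] = finish
        heapq.heappush(self.heap, Item(finish, stream, payload))

    def dequeue(self):
        # Work-conserving: never idle while something is waiting;
        # always serve the lowest virtual finish time first.
        return heapq.heappop(self.heap) if self.heap else None

sched = WFQScheduler({"gold": 2.0, "silver": 1.0})
for i in range(4):
    sched.enqueue("gold", i)
    sched.enqueue("silver", i)
served = []
while (item := sched.dequeue()) is not None:
    served.append(item.stream)
print(served)  # gold items drain roughly twice as fast as silver ones
```
------------------------------------------------------------------------------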