-------------------------------- Notes from the Engine Panel --------------------------------
Goals (stated by MF)
________________________
- Identify key areas where DB technology falls short
- Give a succinct justification for each area
- Identify research areas
- Road map
- Point out possible solutions and non-solutions
- CIDR paper - Dieter Gawlick and co., Oracle
  - Does a lot of the fancy stuff that we think cannot be done in traditional systems. What gives?
- Mike Franklin
  - Sharing and adaptivity are really closely related
  - Sharing is fundamental .. benefit: scalability
    - With standing queries we have all the queries, unlike traditional MQO
    - Also, queries are long-lived
    - Need to avoid duplicate work and keep up with the dataflow rate
    - Need to reduce query entry cost
  - Adaptivity
    - How dynamic is the environment, really?
      - Statistics, query mix, and state all fluctuate
    - CSE - traditional MQO - is static .. over time (with query insertion/deletion) the plan degrades
  - Approaches: YFilter and TelegraphCQ
    - YFilter
      - An NFA fragment for each XPath expression
      - Combine the NFA fragments -> create a single state machine to process them ==> shared (machine) state and processing
      - A trie over XPath expressions
      - Can create one big uber-query .. doable even for XML
      - Adding queries is still easy
    - TelegraphCQ
      - Eddies and SteMs
      - Purest form of eddy: routing on a per-tuple basis
      - Adding new queries is easy, no need to recompile
        - Just add to the eddy; the routing decisions take care of it
        - Over time the routing adjusts to accommodate the new query
      - Makes it much easier to do sharing .. in a dynamic environment
  - What degree of dynamism are we targeting? (Hari)
    - Input changing is the classical idea of the Eddy benefit
    - However, the queries might change too
    - Ability to add queries over time without degrading
  - What's the beef? How many queries, changing how often? (BrianB/JW)
- Big problem (Stonebraker)
  - The Oracle guys have a scalable trigger system .. that's going to solve the standing query problem.
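The YFilter-style sharing sketched above (a trie over path expressions, so common prefixes are matched once for all standing queries) can be illustrated roughly as follows. The class and method names are assumptions for illustration, not YFilter's actual API; real YFilter builds an NFA that also handles `//` and `*`, which this simple exact-path trie does not.

```python
# Minimal sketch of shared matching over linear XPath-like expressions:
# each query's path is folded into one trie, so shared prefixes are
# stored and traversed once. Illustrative only, not YFilter's code.

class SharedPathTrie:
    def __init__(self):
        self.root = {}    # step name -> child node (a nested dict)
        self.accept = {}  # id(node) -> set of query ids accepted there

    def add_query(self, qid, path):
        node = self.root
        for step in path.strip("/").split("/"):
            node = node.setdefault(step, {})  # reuse shared prefixes
        self.accept.setdefault(id(node), set()).add(qid)

    def match(self, path):
        """Return the ids of all queries whose expression equals this path."""
        node = self.root
        for step in path.strip("/").split("/"):
            if step not in node:
                return set()
            node = node[step]
        return self.accept.get(id(node), set())

trie = SharedPathTrie()
trie.add_query("Q1", "/catalog/book/title")
trie.add_query("Q2", "/catalog/book/price")
trie.add_query("Q3", "/catalog/book/title")  # shares the whole path with Q1
matched = trie.match("/catalog/book/title")  # == {"Q1", "Q3"}
```

Adding a query is just a trie insertion, which is why query entry stays cheap even as the number of standing queries grows.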
  - The overhead of routing a tuple through an eddy is comparable to the cost of processing it through a join or a filter. The cost of an eddy is a factor of 2; adaptivity is going to cost 100%.
  - Wake up every 500 ms and change the routing policy
  - Batch tuples through the eddy to reduce routing cost
  - Same as what Oracle does (or Oracle could do this too)
- Mike Franklin's response
  - Only a factor of 2 is great
  - Query optimization gets you a lot of benefit
  - Still don't know how dynamic things really are, of course
- Alex B.
  - Not sure things are really that dynamic
  - Use streaming queries for complex events that are going to trigger some action. There is a finite set of queries in any given application - lots of instances of these, but it's rare that you are going to get more and more of them randomly and dynamically.
  - MF response: eBay example: millions of people coming in ..
  - MS: only one query ..
  - Well, the predicates vary and have different selectivities (Sam)
- CZ - Which killer apps have this property?
  - Perhaps not a lot of such applications
- AA - User-defined operators .. we don't know how they commute? (Are the benefits of an Eddy primarily with commutative operators?)
- MS - With 100K standing queries ..
  we won't use something like Eddies
  - Will use an indexing structure
  - Forgy's Rete network paper
  - The GroupedSelectionFilter is the TelegraphCQ response
- Alex Buchmann:
  - DREAM (Distributed Reliable Event-based Application Management)
  - Stepping outside the DB box
    - Use some DB techniques, but more about tweaking DB techniques
  - Applications
    - ATIS: Air Traffic Info Systems
      - A new generation is being developed in Europe currently
      - Prototype on the gate-to-runway overflow
    - Internet-enabled cars
      - Traffic management
        - Want to do individualized routing
        - Don't want everybody to head in a new direction
          - That's what happens when it gets broadcast on the radio: people take a detour and that causes a fresh traffic jam
      - Sensor data from a car (aggregated locally)
        - Failure data are piped out for analysis by the manufacturer
        - Failure data can also be used to trigger workflows
          - Set up appointments at a repair shop, etc.
      - Personalization of a car
        - Take your car personalizations with you .. a car portal
        - When you rent a car you get to take your settings with you
        - Car and driver both have preferences
        - You can store the personalizations in a PDA .. privacy issues
  - Different event generators
    - Streams of events that have to be pushed to consumers
    - Brokers .. there is more than one broker
      - E.g., ATC - more than one ATC center
  - Self-stabilizing pub/sub
    - Two correctness conditions: liveness and safety
    - When there are random failures, things should eventually be delivered. Done through leases: you must resubscribe from time to time (i.e., renew the lease periodically)
  - System has 2 parts
    - Pub/sub system
      - Loosely couples a bunch of clients
      - Add new consumers easily .. they subscribe to interesting notifications
      - Messages are routed through a notification service
        - A lot of hairy stuff happens in the notification service
      - Lots of simple filters .. not a lot of complex aggregation
    - Reactive functionality
      - Event-Condition-Action style rules
      - Individual services ..
        that can be composed as needed
        - event detection
        - notification
        - condition evaluation
        - purging
        - action execution
        - logging
      - Need a mechanism for composing individual events
        - Just a trigger for some reaction that must be generated
        - Event composition can be done through streaming queries .. but you don't really need to do it that way
          - Can be done with an event algebra
          - Can be compiled into an event graph .. when a complex event gets created in the graph, it pops out
        - Sharma Chakravarthy's consumption modes .. part of Snoop; they use the same for their version
        - With sensors, ordering and correlation are important
      - Queries really run in the clients
        - Very simple clients are those that generate events, like sensors
        - Complex clients are those nodes that do the complex stuff
      - How to reduce the number of events pushed through a pub/sub system, so that you don't have to do flooding
        - Magdalena B.: Does the system take into account that clients have limited bandwidth?
        - AB: intelligent processing in the network ..
  - Problem sources
    - Scale
      - filter placement ..
      - Roaming and resuming; scopes for security
      - dynamic subscriptions
    - Distribution - a source of problems for time and synchronization
    - Heterogeneity - with heterogeneous events, how do you make them compatible?
    - The old s-word
      - Less control in these systems
      - QoS in loosely coupled systems?
  - Addressing in a pub/sub system
    - Tibco does it on the subject keyword .. very strict, with a rigid subject hierarchy
    - JMS: (topic) content-based model .. a simple, homogeneous model
    - Content-based addressing - simple predicates
      - Problems with the above: can do only simple predicates, and assumes a homogeneous environment
    - One level deeper - concept-based addressing
      - If the concept identifiers match, the match is easy
      - If not, we need appropriate conversion functions defined within a context. Relatively efficient because it gets reduced to a simple context.
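The concept-based addressing step just described might look roughly like this in code: match directly when the concept/context IDs agree, otherwise apply a registered conversion function before evaluating the subscriber's predicate. The registry, names, and the USD-to-EUR rate are all hypothetical, not the DREAM implementation.

```python
# Sketch of concept-based addressing (all names hypothetical):
# same context ID -> match directly; different IDs -> look up a
# conversion function for the (source, target) context pair.

# Hypothetical conversion registry; the 0.9 USD->EUR rate is made up.
conversions = {
    ("NYSE", "FrankfurtSE"): lambda e: {**e,
                                        "price": e["price"] * 0.9,
                                        "currency": "EUR"},
}

def deliver(event, event_ctx, subscriber_ctx, predicate):
    """Return the (possibly converted) event if it matches, else None."""
    if event_ctx != subscriber_ctx:
        convert = conversions.get((event_ctx, subscriber_ctx))
        if convert is None:
            return None  # no conversion function defined: cannot match
        event = convert(event)
    return event if predicate(event) else None

quote = {"symbol": "SAP", "price": 100.0, "currency": "USD"}
matched = deliver(quote, "NYSE", "FrankfurtSE",
                  lambda e: e["currency"] == "EUR" and e["price"] < 95.0)
```

The point of the design is that after the one conversion-function lookup, matching reduces to the same simple predicate evaluation as plain content-based addressing.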
      - User-defined context with an ontology (MIX)
        - 3 levels: representation, infrastructure, domain-specific
      - The event notification carries a context ID. An example with the NYSE and the Frankfurt SE.
  - Use of a reliable main-memory cache (replicate stuff in the main memory of other nodes, instead of on disk?)
  - MikeF: Is the heterogeneity problem different from a data-integration environment?
    - Not particularly, but you've got to resolve it on the fly
    - So you've got to build it into the infrastructure
- Ugur Cetintemel
  - Overload management in stream processing engines
    - Aurora's main concern is how to do overload management
    - Different in push-based systems and in stream processing in general
  - Aurora
    - The scheduler is the brain of the system
    - The router gets tuples either from outside the network or from boxes as the result of processing
    - The scheduler and the load shedder are the main components
    - "Control is good," despite what Brewer said at CIDR
  - Push-based systems
    - Things are pushed asynchronously
    - Got to react to things .. including dealing with load beyond the capacity of the system
    - Here we need graceful degradation
    - Without overload management, throughput is going to decrease and then response time is going to increase
  - What is the right execution model?
    - Thread-driven execution model
      - Each task (a query or an operator) is assigned a thread
      - The OS manages the threads
      - Not scalable because of scheduling overhead
      - Destroys cache locality; TLB misses
      - Resource-management decisions are made by the OS in general
      - Want to gain control back from the OS
    - Our execution model - tuple-driven
      - Small number of threads (one per CPU)
      - They loop continuously
      - They make the scheduling decisions
      - Full control means we can keep tuples from entering the system and clogging the network
    - Difference between thread-driven and "message-based"
      - Do you lose control? What if you have sharing, with different tuples for different queries passing through an operator? In the thread-driven model you have the view of a single query.
        In a tuple-driven model, when you merge operators, does this view get diluted?
    - Challenge: how to do scheduling, and how to do it in a lightweight fashion
      - Sharing makes things more interesting
      - Scheduling must be made lightweight
  - Load shedding for overload management
    - Drop tuples from the system .. where to place the drop boxes, etc.
    - Dynamic approaches may be too expensive .. especially when the system is overloaded
    - Dave Maier: Where to place drop boxes? At the edge of the network?
      - You don't want to drop tuples after they have entered the system and you have committed resources to them .. but sharing complicates things. Drop right after a split, in one branch.
    - Assumption from UC: we don't have control over the sources
      - DM: it's a bad assumption
    - Others mentioned that in some cases it might be better to take a quick look at tuples before dropping them, to decide which ones to drop (as opposed to dropping them without any processing)
    - MF: In a classical system you over-configure the system to handle the peak load. Why can't you do this in such a system? UC's response: you cannot size your system based on the worst case.
- Ted Johnson (Gigascope)
  - How to process 1M packets/sec and run a non-trivial query
    - Look at the queries, analyze them, and push them down as far into the system as possible
  - Gigascope is a stream-only model .. similar to boxes-and-arrows
    - Why is this different? (JW)
    - The router is a high-volume data source .. a stream
  - Use a costly mirror to reproduce the packets passing through the router
    - The mirror is a select stream
    - The mirror is attached to a network interface card (NIC)
  - What kinds of operations can be pushed down into the mirror (select stream)?
    - E.g., selection and projection
    - The selections are pushed down by setting registers in the select stream
      - Like a DB operator
    - Pushed a select/project/agg into a NIC
    - Different cards have different capabilities
      - Snap create .. 134 bytes (lets you do projections)
      - Bitmasking ..
        lets you do approximate selections
  - The NIC flows data into a circular buffer .. when it overloads, we drop tuples
  - Sometimes we can use BPF (Berkeley Packet Filter) .. can get locality by pushing projection operations into the kernel
  - From the buffer the data goes into the box of low-level queries (no data copying happens here), which is the packet-filter layer
    - Johannes: Is it a push-pull interface? Queries pulling, data pushing?
      - Queries pull from the circular buffer
    - Queries run in user space
    - Queries can run selection, projection, aggregation, etc.
      - For aggregates they use a direct-mapped cache
    - The first copy is from the NIC into the buffer. The next copy is from the lower-level query layer to the higher-level one.
  - How it works
    - Take the query set (Q1, .. Qn)
    - Extract stuff from it (q1, q2, ... qm)
    - Push them to the BPF below .. got to do some MQO
    - Push Bloom filters down so that approximate filtering can be done
      - Need to drop tuples early whenever possible
    - The low-level queries are just scheduled by the OS
    - If there are sampling filters (state-determining), you've got to be extremely careful in commuting them
- Jennifer Widom (STREAM)
  - Issues in building the system
    - Maintaining intermediate state .. in memory or on disk?
      - Is there an accuracy-latency-storage tradeoff?
      - AB .. an intermediate way
        - A reliable main-memory cache .. sending things off to another machine for main-memory caching might be faster than local disk caching
        - Asynchronously writing to disk
        - Less latency, with very high communication
      - When we run out of memory, we approximate
        - Stonebraker: write to virtual memory
    - What is the objective of a scheduler?
      - What are the performance metrics for continuous queries?
    - How important is approximation?
  - Are streams really going to be all that fast?
    - Can't the filters be pushed to the source?
    - Dennis Shasha: Filtering might be possible only after fusion
  - How adaptive do we need to be?
    - Are they really going to vary that much?
    - Eddies are one extreme ..
      where do you want to be in the spectrum?
    - A military environment is more varying than network monitoring ..
    - Adapting adaptivity
  - API issues??
----------------------------------------------------------------------------------
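The "push Bloom filters down for approximate filtering" idea in the Gigascope discussion can be sketched as follows: the low-level layer tests set membership cheaply and may let a few false positives through, but it never drops a packet the high-level queries need. This is a generic Bloom filter in plain Python; the flow-key format and parameters are illustrative assumptions, not Gigascope's NIC/kernel-level code.

```python
# Generic Bloom filter sketch for early, approximate drop decisions.
# False positives are possible (a flow may survive the filter and be
# discarded later); false negatives are not (added flows always pass).

import hashlib

class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # the bit array, kept as one big integer

    def _positions(self, item):
        # Derive k bit positions by salting a cryptographic hash.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all(self.bits >> pos & 1 for pos in self._positions(item))

# Flows the standing queries care about (hypothetical addr:port keys):
interesting = BloomFilter()
for flow in ["10.0.0.1:80", "10.0.0.2:443"]:
    interesting.add(flow)

keep = interesting.might_contain("10.0.0.1:80")       # always True
maybe = interesting.might_contain("192.168.1.9:22")   # usually False; false positives possible
```

Packets failing the test can be dropped at the lowest layer, which is exactly the "drop tuples early whenever possible" goal in the notes.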