摘要:
An event processing system is disclosed that processes events of an event stream, performs the recovery of events during system failure and preserves the state of the system reliably and accurately while achieving desired system performance. In an embodiment, the event processing system processes a first batch of events of a continuous input stream of events using a continuous query and generates an output stream of events related to an application. The event processing system identifies one or more operators of the continuous query and determines that an operator is a journaled operator. The event processing system generates a journaled snapshot of a current state of the system based on execution of the journaled operator on at least the first batch of events and stores the journaled snapshot of the current state of the system.
摘要:
Some event ordering requirements can be determined based on continuous event processing queries. Other event ordering requirements can be determined based on distribution flow types being used to distribute events from event streams to node executing the queries. Events from event streams can be ordered according to ordering semantics that are based on a combination of all of these event ordering requirements. Additionally, virtual computing nodes can be associated with constraints, and computing processors can be associated with capabilities. Virtual computing nodes for processing event streams can be assigned to execute on various computing processors based on both these constraints and capabilities. Additionally, for each of several events in an event stream, a ratio between a total latency and a communication latency can be for determined. Based on an average of these ratios, a quantity of reducing nodes that will be involved in a map-reduce operation can be selected.
摘要:
Techniques for managing value-based windows on relations are provided. In some examples, an input relation is generated. The input relation is a bounded set of data records related to an application. A continuous query that identifies the input relation may be received. Additionally, a configurable window operator associated with processing the input relation may be identified. Then, the continuous query may be executed based at least in part on the configurable window operator to generate an output relation. Further, in some instances, the data records of the output relation may be provided based at least in part on execution of the continuous query.
摘要:
This disclosure relates to a system comprising: a memory storing a plurality of instructions; and one or more processors configured to access the memory and execute the plurality of instructions to at least: receive a continuous input stream of events related to an application; determine an interval for inserting a checkpoint marker event into the stream of events, wherein the size of the interval is determined based at least in part on a type of the application, a latency requirement of the application, and a frequency at which events of the input stream of events are received; process the continuous input stream of events to generate an output stream of events related to the application, the processing comprising inserting the checkpoint marker event into the continuous input stream to create an event batch, and the event batch including each event of the continuous input stream of events received during the determined interval; determine an output sequence number for an output event in the output stream of events; transmit the output event in the output stream of events; store the output sequence number of the output event; while the continuous input stream of events is being processed, receive an indication of failure of the system; determine a current output sequence number of a most recently transmitted output event in the output stream of events; determine a most recently processed event batch; in response to the indication of failure of the system, re-process the events in the most recently processed event batch and determine a set of one or more output events of the output stream to be transmitted based on the current output sequence number and the most recently processed event batch, the set of one or more output events to be transmitted comprising each of the re-processed events having an output sequence number greater than the output sequence number of the most recently processed event; and transmit the set of one or more output events related to the application.
摘要:
Systems and methods for guaranteeing the event order for multi-stage processing in distributed systems are disclosed. In some examples, a warm-up period is used to accurately determine a starting point for ordered events of an event stream. Skip-beats may be utilized as dummy events so that the event processor does not wait too long for events that were filtered out at earlier stages.
摘要:
Techniques for performing non-event pattern matching on continuous event streams using variable duration. The duration value used in non-event pattern matching can be variable. Accordingly, a first pattern match candidate can have a different associated duration from a second pattern match candidate for matches arising from events received via an event stream. In certain embodiments, the duration for a candidate pattern match may be based upon one or more attributes of an event that started the candidate pattern match or based upon an expression (e.g., an arithmetic expression) involving one or more attributes of the event.