Concept: analogous to SQL transactions (BEGIN TRANSACTION … COMMIT). In SQL a transaction spans multiple tables; in a saga it spans multiple services.
Saga “step”: a local transaction inside a single service.
Between those “saga steps”, you need to know what to do next. There are 2 techniques/patterns for that: choreography (each service reacts to events emitted by the previous step) and orchestration (a central orchestrator tells each service which local transaction to run).
Sagas should take milliseconds.
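The steps-plus-compensations idea can be sketched as a tiny orchestrator. This is a minimal illustration, not a real framework: the step names, the `SagaStep` class, and the in-memory `log` are all hypothetical; in practice each action/compensation would be a call to a remote service.

```python
# Saga orchestration sketch (hypothetical services): each step is a local
# transaction in one service; if a later step fails, already-completed steps
# are undone by running their compensations in reverse order.

class SagaStep:
    def __init__(self, name, action, compensation):
        self.name = name
        self.action = action              # the local transaction
        self.compensation = compensation  # undoes the action on rollback

def run_saga(steps):
    completed = []
    for step in steps:
        try:
            step.action()
            completed.append(step)
        except Exception:
            # Roll back: compensate completed steps in reverse order.
            for done in reversed(completed):
                done.compensation()
            return False
    return True

# Usage: a hypothetical two-service order saga where the second step fails.
log = []

def fail():
    raise RuntimeError("order service down")

steps = [
    SagaStep("reserve-credit",
             lambda: log.append("credit reserved"),
             lambda: log.append("credit released")),
    SagaStep("create-order", fail,
             lambda: log.append("order cancelled")),
]
ok = run_saga(steps)
# ok is False; the credit reservation was compensated
```

Note the design choice this illustrates: there is no distributed lock or two-phase commit, only forward steps and compensating steps, which is why each step must itself be atomic within its own service.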
CQRS: maintain a replica of your event data that can easily be queried, applying whatever transformations are needed to optimize the replica for your querying needs. The replica is updated by subscribing to the events emitted by the services that own the data. You can use NoSQL databases (MongoDB, Elasticsearch for text search, PostgreSQL with JSON fields…). You should be able to rebuild the replica from scratch.
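A minimal sketch of such a projector, assuming made-up event shapes (`OrderCreated`, `OrderShipped`) and an in-memory dict standing in for the query-optimized store:

```python
# CQRS read-model sketch: a projector consumes domain events and maintains a
# denormalized replica optimized for queries. Because the replica is derived
# purely from events, it can be rebuilt from scratch by replaying history.

read_model = {}  # order_id -> denormalized view row

def project(event):
    if event["type"] == "OrderCreated":
        read_model[event["order_id"]] = {
            "customer": event["customer"],
            "status": "created",
            "total": event["total"],
        }
    elif event["type"] == "OrderShipped":
        read_model[event["order_id"]]["status"] = "shipped"

def rebuild(history):
    # Rebuilding the replica = clearing it and replaying every event.
    read_model.clear()
    for event in history:
        project(event)

# Usage: replaying a short hypothetical history.
history = [
    {"type": "OrderCreated", "order_id": "o1", "customer": "ada", "total": 42},
    {"type": "OrderShipped", "order_id": "o1"},
]
rebuild(history)
```

In a real system `project` would write to MongoDB, Elasticsearch, or a PostgreSQL JSON column instead of a dict, and `rebuild` is what makes schema changes to the read model cheap.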
Services must atomically update state and send messages. For that, you can use event sourcing, which implies an “event store”: it must record the timestamp, the entity id/type, and the event id/type/data. Kafka can act as an event store, but it does not allow you to retrieve events by id, and you must have that capability. It is an excellent message broker (its main advantage is that it guarantees ordering within a topic partition), but you should not count on it to store the events, precisely because of this missing capability. Use Kafka just as a message broker: as soon as the broker receives an event, persist it in the event store. That way you can use CQRS as described above.
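The event-store requirements above can be sketched in-memory. This is an illustrative stand-in, not a production store (a real one would be a database table with these columns); the class and method names are assumptions:

```python
# Event-store sketch: every event records timestamp, entity id/type and
# event id/type/data, and -- unlike a Kafka topic -- supports direct
# retrieval by event id as well as replay of all events for one entity.
import time
import uuid

class EventStore:
    def __init__(self):
        self._by_id = {}      # event_id -> event
        self._by_entity = {}  # entity_id -> ordered list of events

    def append(self, entity_type, entity_id, event_type, data):
        event = {
            "event_id": str(uuid.uuid4()),
            "timestamp": time.time(),
            "entity_type": entity_type,
            "entity_id": entity_id,
            "event_type": event_type,
            "data": data,
        }
        self._by_id[event["event_id"]] = event
        self._by_entity.setdefault(entity_id, []).append(event)
        return event

    def get(self, event_id):
        # The capability Kafka lacks: lookup by event id.
        return self._by_id[event_id]

    def for_entity(self, entity_id):
        # Replay source: current state = fold over the entity's events.
        return list(self._by_entity[entity_id])

# Usage: persist events for one entity, then retrieve by id.
store = EventStore()
created = store.append("Order", "o1", "OrderCreated", {"total": 42})
store.append("Order", "o1", "OrderShipped", {})
```

In the flow described above, a consumer would call `append` as soon as an event arrives from the broker, and the CQRS projector would read from the store rather than from Kafka.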