I suppose I can, in some ways, appeal to graph theory for a conceptual basis for the design, but that is a little way off (and currently lives on scribbled bits of paper and in 'badly drawn' Visio diagrams).
So I've settled initially on at least defining some concrete base requirements:
- Self-discovering: multicast discovery of reachable nodes, preferably driven by convention rather than configuration
- Failover of 'broken' nodes: core requirement
- Failback support: core requirement
- Elections and voting: Nodes may participate in elections, voting a 'prime' node in or out, perhaps based on its performance in terms of correctness
- Adjudication: Ability of nodes to form a 'committee' to adjudicate on a tie, which can arise when the number of voting nodes is even
- Dead heat support: Deadlock prevention when nodes contend for shared resources
- Graceful removal: Nodes need to be able to retire gracefully (before potentially being re-animated)
- Split brain detection: An infamous scenario; many possible solutions exist, each with its own advantages and disadvantages
- Roving agents: A bit of a '90s throwback, this; agents that 'move' among nodes, sampling or perhaps providing support where needed
- Message repository: core requirement
- System messages versus application messages: Differentiation between system-level messages (quorum administration) and higher-level, implementation-specific messages
- Systems operation support: integration with SNMP/SCOM and so on
- HTML5 graphical manager: A GUI for quorum management/query/analysis
- Priority support: Allowing priorities to drive reaction to and treatment of messages (of any type)
- Implementation: based on patterns and best practice
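To make the elections and adjudication items a little more concrete, here is a rough Python sketch of how a tally with a committee tie-break might look. None of this is a settled design: the function names (`tally`, `elect_prime`), the strict-majority rule and the idea that a committee is simply a subset of voters whose votes are recounted are all assumptions made purely for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def tally(votes: Dict[str, str]) -> Optional[str]:
    """Return the candidate holding a strict majority, or None if there is none.

    `votes` maps a voting node's id to the node id it voted for
    (a hypothetical representation, not a fixed message format).
    """
    if not votes:
        return None
    counts = Counter(votes.values())
    candidate, count = counts.most_common(1)[0]
    # Strict majority: more than half of all votes cast.
    return candidate if count * 2 > len(votes) else None


def elect_prime(votes: Dict[str, str], committee: Iterable[str]) -> Optional[str]:
    """Elect a 'prime' node, falling back to a committee recount on a tie.

    Assumption: the committee is a subset of the voters, ideally of odd
    size, and adjudicates by having only its members' votes counted.
    """
    winner = tally(votes)
    if winner is not None:
        return winner
    committee_votes = {node: votes[node] for node in committee if node in votes}
    return tally(committee_votes)
```

With four nodes split evenly between two candidates, `tally` finds no majority, and an odd-sized committee of three of those nodes can then settle the result. A real implementation would also need to decide how the committee itself is chosen, which this sketch leaves open.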
I've also arrived at a rather limp name for the endeavour:
Software quorum, adjudications, tallying and elections
As an acronym, therefore: SOQRATES.
Groan.