Ethos ART Story point standard used by Scrum Teams

In the Ethos Agile Release Train (ART), a user story is sized using the story points 1, 2, 3, 5, or 8. Reference: https://www.scaledagileframework.com/iteration-planning/

Story point sizing

Ethos Agile Teams use story points to estimate stories relative to one another. With relative estimating, the size (expected effort) of each story is compared to other stories. Make sure that each story point estimate includes the development, testing, and validation effort of the Scrum Team. The team’s velocity is the historical average of the story points for all stories completed per iteration. Velocity is the starting point for calculating a team’s capacity for a future iteration. Velocity is also used to estimate how long it takes to deliver Features or Epics, which are also forecast in story points.
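As a rough illustration of the velocity arithmetic described above, here is a minimal Python sketch. The function names and the sample numbers are hypothetical and are not part of the Ethos standard; the sketch assumes velocity is the simple average of completed story points per iteration.

```python
import math
from statistics import mean


def velocity(completed_points_per_iteration):
    """Historical average of story points completed per iteration."""
    if not completed_points_per_iteration:
        raise ValueError("need at least one completed iteration")
    return mean(completed_points_per_iteration)


def iterations_to_deliver(total_points, team_velocity):
    """Forecast how many iterations a Feature or Epic may need, rounded up."""
    return math.ceil(total_points / team_velocity)


# Hypothetical history: points completed in the last three iterations.
history = [21, 18, 24]
team_velocity = velocity(history)
print(team_velocity)                             # 21
print(iterations_to_deliver(50, team_velocity))  # 3
```

The forecast is only as good as the history behind it, so a team with few completed iterations should treat the result as a starting point for capacity planning rather than a commitment.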

Ethos Scrum Team Estimation Standard 

Ideal Days      Story points      Note
0.5 to 1.5      1
1 to 3          2
2 to 4          3
4 to 6          5
7 to 9          8
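The calibration table above can be expressed as a small lookup, sketched here in Python. The names are hypothetical. Note that the Ideal Days ranges overlap (for example, 2.5 days fits both 2 and 3 points) and leave a gap between 6 and 7 days, so the reverse lookup returns every matching point value rather than a single answer.

```python
# Ideal Days range (low, high) for each story point value, from the table above.
IDEAL_DAYS_BY_POINTS = {
    1: (0.5, 1.5),
    2: (1, 3),
    3: (2, 4),
    5: (4, 6),
    8: (7, 9),
}


def ideal_day_range(points):
    """Return the (low, high) Ideal Days range for a story point value."""
    if points not in IDEAL_DAYS_BY_POINTS:
        raise ValueError(f"{points} is not on the 1, 2, 3, 5, 8 scale")
    return IDEAL_DAYS_BY_POINTS[points]


def candidate_points(ideal_days):
    """Return all story point values whose range contains ideal_days."""
    return [
        points
        for points, (low, high) in IDEAL_DAYS_BY_POINTS.items()
        if low <= ideal_days <= high
    ]


print(ideal_day_range(5))     # (4, 6)
print(candidate_points(2.5))  # [2, 3]
print(candidate_points(6.5))  # []
```

When more than one value matches, the team's own discussion decides the final estimate; the table is a calibration aid, not a conversion formula.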

Some points to remember

  • Smaller user stories allow early development and testing (keep prioritization in mind as well).
  • The team can work on several smaller stories at the same time.
  • Assigning a single 8-point user story to one person is not recommended, as it will take a whole sprint to complete.
  • Scrum Masters are required to educate their Scrum Teams.
  • All user story points should include testing and validation effort along with development work (that is, all the work effort required to complete that particular user story).
  • When sizing a user story, make sure that all team members provide their points simultaneously, covering development, testing, and validation work.
  • Please discuss and ask for justification if you think the estimated story points are too high or too low.

Note: SAFe Feature = Jira Epic for Ethos Scrum Team

 ICON Agility Services Suggestion 

The following standards are suggested by ICON Agility Services. These were initially introduced via conversations with Scrum Masters and Product Owners, as part of the onramp timeline towards the April 2020 PIPE. The PowerPoint deck can be found here. The video stream for the Scrum Masters conversation can be found here.

Of course, we should use local context and discretion within each PDev Team, remembering that ultimately the Team is committing to holistic delivery of each Story loaded to the Program Increment plan. Commitments represent Team capacity, not individuals’ capacity.

There are common anti-patterns that an estimation standard resolves. For example, focusing on the efficiency of individuals by function, rather than the effectiveness of the Team, which relies on timely availability of individual Team members, is a common anti-pattern that erodes the reliability of the entire Vertical planning system.

As far as possible, the Program Increment (PI) Planning Event (PE) Iteration timings and load will be honored. However, we should factor in what we’ve learned and discovered in earlier sprints during the final planning of each Sprint.

Types of Estimates used in a PI Planning Event (PIPE)

  • T-shirt sizes
    • Small:  representing the lower range of the Relative pointing standard
    • Medium:  representing the middle range of the Relative pointing standard
    • Large: representing the top end of the Relative pointing standard, and a higher risk of overrunning the Iteration
  • Relative Points
    • Using the Fibonacci Sequence, we size each story relative to a norming standard set by each Team over a period of time. Typically the reference point is the smallest story. Sometimes the mid-sized story is the relative base reference. Consistency within a Team is key.
    • Scale:   1,2,3,5,8
      • Presuming a 2 week Iteration for the Sprint interval, this implies approximately 9 working days (10 business days – 1 day for aggregate time invested in all Iteration Ceremonies).
      • Refer to the Estimation Calibration standard and observe that an 8-point story implies 7 to 9 Ideal Days of time, start to finish, to deliver that complete story.
      • Stories > 8 relative points will not fit within one Iteration, and must be decomposed further.   Therefore we suggest capping estimates at 8.
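The iteration arithmetic and the 8-point cap described in the list above can be sketched as follows. This is a minimal Python illustration with hypothetical names; the defaults (10 business days, 1 day of ceremonies) come from the text, and teams with a different Iteration length would pass their own values.

```python
ALLOWED_POINTS = (1, 2, 3, 5, 8)
MAX_STORY_POINTS = 8


def working_days(business_days=10, ceremony_days=1):
    """Working days left in an Iteration after all ceremonies."""
    return business_days - ceremony_days


def needs_decomposition(points):
    """A story above the cap will not fit within one Iteration."""
    return points > MAX_STORY_POINTS


def is_on_scale(points):
    """True if the estimate uses one of the agreed point values."""
    return points in ALLOWED_POINTS


print(working_days())           # 9
print(needs_decomposition(13))  # True
print(needs_decomposition(8))   # False
print(is_on_scale(4))           # False
```

With the defaults, the 9 working days match the upper end of the 7 to 9 Ideal Days implied by an 8-point story, which is why 8 is the suggested cap.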

When to use:  T-shirt sizes are a top-down convention used while attempting to clarify the “ask” and the “solution approach/options” while Drafting the Agile artifact. Relative points are used as part of Finalizing the Agile artifact, bringing more precision and stability to the timings of Starting and Finishing (Calibration) that unit of work within a layer of the Agile structure. These relative points are aggregated back up to the layer above, where a Correlated estimate can be honed.

Calibration and Correlation

As part of Scaling Agile Practices, it’s important that we establish a predictable, near-term (6 to 9 months out) Forecasting model. GoToMarket coordination and related activities rely on this, as does Capital Budgeting. Please reference the links at the top of this page for more rationale.

Calibration is a scaling standard that stabilizes the way a specific Team estimates over time. Do not confuse this with normalization and the starting guidance from SAFe. As referenced in the April discussion, if we were to assign the exact same story to an experienced team versus a team that is new (to the industry, FIS, etc.), we’d expect a different estimate. It’s crucial that we can assess the effects of certain assignments to certain teams. To reduce Cost of Delay, we must enable the Planning System to defer Team assignments to the last responsible moment. The predictability this standard enables is crucial to this goal.

Presuming a Team will deliver on the Value Intent of their Sprint commitments (the caveat of Scaling Agile), we next optimize on the estimated time of arrival (ETA), to assist with Integration Testing and give further assurance that Stories (and testing defects) will not escape their Iteration and, likewise, that each Feature arrives within reason according to the ETA determined at the PI.

Correlation is a scaling standard that enables a Team to compare its estimate to the aggregate estimates of the delivery at the next level below them (those delivering on their guidance). This can be thought of as “Calibration across the layers of the Planning system”. Correlation enables Teams at higher levels to learn frequently and early from the estimates rolling up from below. Crucial conversations can be held to determine if the Value Intent and Solution “Guard Rails” are well understood. If not understood, scope can be contained early. If lower level estimates did reflect the guard rails of Intent, then the higher level Team learns how to improve their estimation approach and resets their expectations.

Calibration enables Correlation. Together they provide a reliable Forecasting model by which to govern using intentionally less precise Agile@Scale In-take practices, which favor direct witnessing of an incrementally evolving Working Solution that can be validated by Customers, and a quickly closed Feedback Loop (see the Chutes and Ladders presentation and video; to be added ~4/20/20).

Because the ultimate commitment is made by the collective of Product Development teams during the PI Planning Event, the standard is anchored at the Story level within the PDev layer.

Stories

A story must be delivered within 1 Sprint / Iteration.   Ideally, the Story would require not more than half of a Sprint, allowing for earlier integration and validation of progress for the Feature.

T-Shirt Size

  • Small:  1,2, or 3 Relative Points in size
  • Medium:  3 or 5 Relative Points in size (the ranges intentionally overlap to reflect imprecise info available)
  • Large: 8 and possibly larger in Relative Points size (a.k.a. “at risk”)
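The Story T-shirt mapping in the list above can be sketched as a lookup in Python. The names are hypothetical. Because the Small and Medium ranges intentionally overlap at 3 points, the function returns a list of matching sizes, and anything above 8 is treated as Large ("at risk") per the last bullet.

```python
# Relative point values covered by each Story T-shirt size.
STORY_TSHIRT_POINTS = {
    "Small": {1, 2, 3},
    "Medium": {3, 5},
    "Large": {8},
}


def tshirt_sizes(points):
    """Return every T-shirt size that a relative point estimate falls into."""
    if points > 8:
        # "8 and possibly larger": oversized stories are Large and at risk.
        return ["Large"]
    return [
        size
        for size, values in STORY_TSHIRT_POINTS.items()
        if points in values
    ]


print(tshirt_sizes(2))   # ['Small']
print(tshirt_sizes(3))   # ['Small', 'Medium']
print(tshirt_sizes(13))  # ['Large']
```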

When to use:  During Feature Refinement in-take working sessions, where stories are derived from a Feature. This occurs during a PI’s execution timeline, typically about 6 weeks prior to the PIPE. See Story Mapping for more info. Note: Pre Planning meetings scheduled for PIPE2 are a “starting convention” which emulates Feature Refinement in-take.

Relative Points (consult the Presentation and Video)

This raises the question: what resides below the Story estimate that helps us to Correlate and Calibrate the way we estimate Stories?

This is the Task work item type and estimate. Presuming a Daily stand-up ceremony, a common Estimation standard for determining what is “material and worth tracking” as a Task is to size Tasks at 2, 4, 6, or 8 hours. This is often referred to as Task-Based Sprint-Commit Planning (see Mike Cohn, Succeeding with Agile). Note: this brings us ultimately back to Ideal Hours, which in turn helps to hone the Ideal Days as a Range Calibration standard residing at the Story level.

When to use:  During each official Sprint Planning Ceremony (possibly as part of Story In-take refinement).
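One way to picture how Task estimates feed back into Story calibration is the Python sketch below. It sums Task hours into Ideal Days and checks the result against the Ideal Days range for the Story's points. The names are hypothetical, and the 8 hours per Ideal Day conversion is an assumption made for illustration; the standard itself does not state a conversion rate.

```python
TASK_SIZES_HOURS = (2, 4, 6, 8)
HOURS_PER_IDEAL_DAY = 8  # assumption, not stated in the standard

# Ideal Days range (low, high) per story point value, from the Ethos table.
IDEAL_DAYS_BY_POINTS = {
    1: (0.5, 1.5),
    2: (1, 3),
    3: (2, 4),
    5: (4, 6),
    8: (7, 9),
}


def task_total_ideal_days(task_hours):
    """Sum Task estimates (2, 4, 6, or 8 hours each) into Ideal Days."""
    for hours in task_hours:
        if hours not in TASK_SIZES_HOURS:
            raise ValueError(f"task size {hours}h is not 2, 4, 6, or 8 hours")
    return sum(task_hours) / HOURS_PER_IDEAL_DAY


def within_calibration(points, task_hours):
    """True if the Task total falls inside the Story's Ideal Days range."""
    low, high = IDEAL_DAYS_BY_POINTS[points]
    return low <= task_total_ideal_days(task_hours) <= high


tasks = [8, 8, 6, 4, 8, 6]           # 40 hours in total
print(task_total_ideal_days(tasks))  # 5.0
print(within_calibration(5, tasks))  # True
print(within_calibration(3, tasks))  # False
```

A mismatch like the last line is a prompt for conversation during Sprint Planning: either the Story was under-pointed or the Task breakdown includes work outside the Story.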

Features

A Feature must be delivered within 1 Program Increment. Ideally, the Feature would require not more than half of a PI, allowing for earlier integration and validation of progress against the corresponding SAFe Epic.

Here’s one possible approach to estimating a Feature. Establish a standard that represents approximate cardinality for a mix of Stories of certain sizes and presumed distributions across PDev Teams for a common number of Iterations.

T-Shirt Size

  • Small:  5 to 15 stories #, pursued by 1 PDev team across ~2 to 4 sprints.
  • Medium: 10 to 25 stories, pursued by 2 to 3 PDev teams across ~2 to 4 sprints.
  • Large: 25 to 40+ stories, pursued by 2 to 3 PDev teams across ~3 to 5 sprints.

A less precise “starting convention” might be to use a multiplication factor of 10 against the established Story Relative Pointing standard. 

T-Shirt Size

  • Small: 1,2, or 3 = 1×10 to 3×10 points.   10 to 30 points, regardless of the complexity and distribution of the Stories in 1 or more PDev teams below.
  • Medium: 3,5 = 3×10 to 5×10 points.   30 to 50 points, regardless of the complexity and distribution of the Stories in 1 or more PDev teams below.
  • Large: 8×10 = 80+ points
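The ×10 starting convention above, combined with the idea of Correlation (comparing a higher-level estimate with the points rolling up from below), can be sketched in Python as follows. The names and the sample Story points are hypothetical; this is one reading of the convention, not a prescribed calculation.

```python
FEATURE_FACTOR = 10

# Story-level point bounds per T-shirt size; None means open-ended ("80+").
STORY_TSHIRT_BOUNDS = {
    "Small": (1, 3),
    "Medium": (3, 5),
    "Large": (8, None),
}


def feature_point_range(size, factor=FEATURE_FACTOR):
    """Scale the Story bounds for a T-shirt size up to Feature points."""
    low, high = STORY_TSHIRT_BOUNDS[size]
    return low * factor, (high * factor if high is not None else None)


def correlates(size, story_points, factor=FEATURE_FACTOR):
    """True if the rolled-up Story points fit the Feature's T-shirt range."""
    total = sum(story_points)
    low, high = feature_point_range(size, factor)
    return low <= total and (high is None or total <= high)


print(feature_point_range("Small"))   # (10, 30)
print(feature_point_range("Large"))   # (80, None)

stories = [5, 5, 8, 3, 5, 8, 3, 5]    # hypothetical Stories, 42 points in total
print(correlates("Medium", stories))  # True
print(correlates("Small", stories))   # False
```

When the roll-up does not fit the T-shirt size, that is the signal for the conversation about Value Intent and guard rails rather than an automatic re-estimate.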

SAFe Epics

An Epic must be delivered by 1 Agile Portfolio (cannot be shared across Portfolios).   Ideally, the Epic (MVP within the Initiative’s Lean Business Case) would require not more than 2 Program increments, as a way of validating progress against Strategic Initiatives.

Here’s one possible approach to estimating an Epic (LBC’s MVP).   Establish a standard that represents approximate cardinality for a mix of Features of certain sizes for a common number of Iterations.   Note:  A Feature cannot span ARTs (Agile Release Trains) and the distribution across PDev teams is realized at the Story level.

T-Shirt Size

  • Small:  6 to 25 Features #, delivered across ~.5 to 1.5 PIs.
  • Medium: 20 to 35 Features, delivered across ~1.5 to 2 PIs.
  • Large: 40 to 60+ Features, delivered across ~2 to 3 PIs.

A less precise “starting convention” might be to use a multiplication factor of 100 against the established Story Relative Pointing standard. 

T-Shirt Size

  • Small: 1,2, or 3 = 1×100 to 3×100 points.   100 to 300 points, regardless of the complexity and distribution of the Features
  • Medium: 3,5 = 3×100 to 5×100 points.   300 to 500 points, regardless of the complexity and distribution of the Features
  • Large: 8×100 = 800+ points

#Note on the calculation for Quantity of Cards at this level:  For the upper bound count in the Range, determine the median relative point size for the artifact below this layer, and divide the total potential points for this T-shirt size by this number. For the lower bound count in the Range, determine the lower relative point size for the artifact below this layer, and divide the total potential points for this T-shirt size by this number. Round up/down to create an intentionally overlapping range across T-shirt sizes, which reinforces the imprecise nature of the info available during T-shirt sizing.

This calculation is not precise science. It’s a gauge which suggests that if we realize our max throughput rate, and optimize the size of our cards (at this layer AND the derivation layer below) to fall within the small-to-medium relative point range, then this might be an early indicator of an improving Throughput rate, evident in our ability to slice the work better. We should NOT count cards. We measure the Value of what’s actually delivered compared to what was requested (the Value Intent), as understood by the Agile Manifesto and the SAFe Principles, and conveyed transparently during Demos and Reviews.