This year I changed teams at my day job, and I've had some time to compare and contrast a few incident-response models that I've seen or participated in over the years. The big change for me was going from 4+ hours a day spent in meetings down to 2 hours a week. Not only has my productivity gone up, but I feel much less stressed and harried, even when the incident at hand is severe.
Common incident-response team models
I've seen incident-response teams organized in one of three ways. The choice of model seems to depend on the size of the environment and the level of perceived threat. They can be described as:
One-Man Band
The one-man band invariably occurs in small organizations that have only one full-time or part-time IT person. I've been in the shoes of these jacks-of-all-trades and I don't envy them. They are either already master jugglers or facing a nervous breakdown. The only advice I have for them here is to try to track the time spent on the following:
Building the Infrastructure (design, or new installs)
Maintaining the Infrastructure (upgrades, patching, what passes for trouble tickets in your environment)
Defending the Infrastructure (maintaining security tools, training users)
Responding when the defense fails (dealing with a compromise or an infected system)
This information may help you organize your budget or lobby for more help.
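As a rough illustration of that kind of tracking, here is a minimal Python sketch that totals hours per category and reports each category's share of the week. The four categories come from the list above; the `Category` and `summarize` names and the sample hours are assumptions made up for the example, not figures from any real environment.

```python
from collections import defaultdict
from enum import Enum


class Category(Enum):
    """The four buckets from the list above."""
    BUILD = "Building the infrastructure"
    MAINTAIN = "Maintaining the infrastructure"
    DEFEND = "Defending the infrastructure"
    RESPOND = "Responding when the defense fails"


def summarize(entries):
    """Return {Category: (hours, share_of_total)} for (Category, hours) pairs."""
    totals = defaultdict(float)
    for category, hours in entries:
        totals[category] += hours
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {c: (h, h / grand_total) for c, h in totals.items()}


# Hypothetical week of entries; the hours are invented for illustration.
week = [
    (Category.BUILD, 6.0),
    (Category.MAINTAIN, 10.0),
    (Category.DEFEND, 4.0),
    (Category.RESPOND, 12.0),
    (Category.MAINTAIN, 8.0),
]

for category, (hours, share) in summarize(week).items():
    print(f"{category.value:36s} {hours:5.1f} h  {share:6.1%}")
```

Even a tally this simple, kept for a month or two, gives you numbers to point at when you ask for budget or another pair of hands.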
Volunteer Fire Department
An organization that is large enough to have a real IT staff, but not big enough or under enough (perceived) threat to justify full-time security staff, falls into this category. Individuals with the appropriate skills or desire may be tapped from time to time to help respond to the periodic infection or intrusion. Over time, as the threat or exposure grows, it starts to seem like certain people are always responding to incidents, which is the sign that it's time to move on to the next model.
Dedicated Security Staff
An organization with a dedicated security staff can also become overwhelmed by the challenge of balancing the constant flow of events with the ongoing improvement of the environment's security. Staff can either be pigeon-holed into tasks or expected to know everything. Managing the personnel and their time can be very challenging.
What is standby time?
Although it may sound like it means standing around doing nothing, standby time is more like on-call or ready-to-serve time. Some organizations implement on-call time as that week or two when you're stuck with the pager, so if anything happens after hours you're the one who gets called. Otherwise known as the "sorry family, I can't do anything with you this week" time. As the organization grows, this becomes less onerous as it moves to a fully staffed 24/7 structure with experienced people. That's not really what I mean by standby time.
Standby time is time set aside in the daily schedule and devoted to incident response. Most of the time it should focus on the first stage of incident response: Preparation. It's time spent keeping up to date on security news and events, updating documentation, and building tools and response processes. It's interruptible should an incident arise, but it's not interruptible for other meetings or projects.
Why Standby time?
Teams that are tasked with other IT maintenance work in addition to incident response will not take time to learn new skills, document their process, or pass on lessons learned unless time is set aside for these functions. Incident response is interrupt-driven and reactive (it's right there in the name). If you don't give that individual time to step out of the event stream and gather their thoughts, you're not going to get the documentation out of them that you need. Similarly, if you also task your incident responder with organizing and managing a long-term project, but don't give them time to organize and manage that project, it's going to suffer terribly as well.
Scheduling Standby time
Ideally, you should have someone on your dedicated incident-response staff on standby during your hours of operation (if you're big enough to have a full-time incident-response staff, you're probably 24/7 already). This is not on-call, maybe-we-need-you time, but time spent in the office doing the job of keeping up to date and Preparing. Overlapping this time with other team members will help them collaborate on both tool-building and process updates.
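One way to picture overlapping standby time is a simple round-robin rota that puts more than one person on standby each day. The Python sketch below is only an illustration under assumptions of my own: the `standby_rota` function, the team names, and the choice of two people per day are hypothetical, and a real schedule would also have to account for leave, shifts, and skills.

```python
def standby_rota(team, days, per_day=2):
    """Assign `per_day` team members to standby for each day, round-robin.

    Consecutive days share a member, so standby time overlaps between people.
    """
    if not team:
        raise ValueError("team must not be empty")
    if per_day > len(team):
        raise ValueError("per_day cannot exceed team size")
    rota = {}
    for offset, day in enumerate(days):
        # Rotate the starting person by one each day.
        rota[day] = [team[(offset + i) % len(team)] for i in range(per_day)]
    return rota


# Hypothetical team and working week.
team = ["alice", "bob", "carol"]
days = ["Mon", "Tue", "Wed", "Thu", "Fri"]

for day, people in standby_rota(team, days).items():
    print(f"{day}: {', '.join(people)}")
```

Pairing people this way means nobody prepares alone, and whoever carries over from the day before can pick up the tool or document that was in progress.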
If you're trying to improve the documentation coming from your incident-response team, think about including daily chunks of standby time so the team can focus on preparation, lessons learned, and process documentation.
(c) SANS Internet Storm Center. http://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.