OXFORD SCIENCE MEETING for THERAPEUTIC COMMUNITIES, St Hilda’s College, 31 March – 1 April 2008


written up by Vanessa Jones,

Honorary Researcher, Thames Valley Initiative

Sponsored by:

Thames Valley Initiative...the host organisation, having recently established a network of NHS day-TCs

Association of Therapeutic Communities...the professional body representing therapeutic communities in UK

Community of Communities...the quality network and accreditation system for therapeutic communities

National Personality Disorder Development Programme...the English Department of Health’s programme to develop new services and training

Personality Disorder Institute...a part of Nottingham’s ‘Institute of Mental Health’ committed to developing research and training in the personality disorder field

Consensus 1: methodological problems

  • RCTs of therapeutic communities are an essential aspect of the enquiry – with policy makers, funders and clinicians – for proper development of treatment programmes for PD, addictions and psychosis.
  • A democratic therapeutic community RCT has not been undertaken hitherto; potential barriers have included
    • The clinical community’s inability to hold a position of equipoise
    • Funding bodies’ reluctance to support PD research
    • Complexity arising from the involvement of clients in selecting each other and the consequences of this for waitlist controls, blinding etc.
    • Concerns about the internal/external validity trade-off
  • There is now sufficient data in the PD field to design a sequentially organised programme of randomised controlled studies
  • A small study to examine the effect of randomisation is the first step required in respect of PD: to both examine whether randomisation is possible, and whether it has a deleterious effect on the treatment programme
  • Reanalysing data using more sophisticated models may be helpful - in outcome and in population selection, including severity (also see consensus 3)

Consensus 2: proposed Oxford study

  • Day TC treatment, as recently developed in several PD pilot sites, is a suitable method for an exploratory RCT for PD as a two arm trial – of “options followed by TC” vs “TAU + crisis plan”
  • Recruitment of approximately 80 participants could happen over 18 months; this is likely to be a suitable number for a pilot study
  • Fidelity should be rigorously established; this could be done with a checklist of criteria applied by an independent rater, possibly as an adaptation of the existing accreditation process
  • Qualitative analysis would be required for evaluation of the impact of randomisation on the treatment programme
  • Quantitative analysis would be required to demonstrate numbers progressing from the options group to the full TC programme before and after the randomisation
  • Baseline data which have already been collected should also be analysed
  • Primary outcome will be service utilisation costs, which need to be supported by hospital records (follow-up for up to five years has been agreed by the REC). Others will also be needed to permit comparison with other studies.
  • Service utilisation is a clinical outcome in its own right
  • A project advisory group should be established from members of this workshop

Consensus 3: alternative designs

  • RCTs are necessary but not sufficient evidence for policy-makers, clinicians and service users
  • There is insufficient evidence that high severity of PD requires more intensive treatment
  • Other approaches are required to explore the ‘active ingredients’.
  • Re-examining existing data to assess changes at different levels would be fruitful: time series, multi-level modelling, and other advanced statistical methods.
  • Different research designs could be equally valid at this stage of development of the field:
    • Econometric research
    • Service user research
    • Qualitative research
    • Action research

Questions on alternative designs

  • Can we measure what matters about ‘change’?
...Behaviour, certainly – but can we meaningfully and objectively capture ‘internal change’?
  • How important are other methodologies for influencing treatment, policy and science?
  • What is the ‘killer fact’?

OXFORD SCIENCE MEETING: details of the discussions

Monday 31 March - Tuesday 1 April 2008

Those present:

Nick Benefield (NB) (Monday only), Eric Broekaert (EB), Mike Crawford (MC), Mark Freestone (MF) (Monday only), George de Leon (GdeL), John Gale (JG), Rex Haigh (RH), Kath Harney (KH), Vanessa Jones (VJ), Eddie Kane (EK) (late Monday), Vasilli Maglios (VM), Steve Pearce (SP) (Monday only), Beatriz Sanchez (BS), Steve Sizmur (SS), Peter Tyrer (PT) (Tuesday only), Fiona Warren (FW), Min Yang (MY), Jenny Yeind (JY)

SESSION ONE: 2pm Monday


An introduction by Rex Haigh summarised the ‘history’ behind this meeting taking place. This led to possible definitions of TCs – Addiction Model, British Model, and one for all TCs.

Around the table introductions were made with each giving their own opinions on the question of whether an RCT was possible in TCs – and whether it was desirable. A range of views were heard with people strongly supporting the ‘political/economic’ need for RCT evidence of TCs, whilst raising concerns around the RCT process. It was generally agreed that an RCT can only give one type of evidence and what TCs needed for clinical reasons was a mixture of research methodologies, including qualitative studies. The new ‘day TC’ model was thought to be probably the easiest model in which to do an RCT.

A presentation was given by Fiona Warren on the topic “What do we know so far?” setting out a brief summary of TC research to date and then leading to a discussion of outcome methodologies. This covered areas including the vast heterogeneity of TCs, the difficulties in knowing what we are measuring/relevant outcome measures, the limitations of the studies conducted in TCs so far and the challenges in methodology for TCs. Overall, there were studies which gave useful information and should not be ignored, but none that reached the standard of evidence provided by RCTs.

This was followed by a presentation by Rex Haigh. Starting with a discussion of the ‘hierarchy of evidence’, there was debate as to the value of service user/clinician opinion: this was considered important in clinical terms but not as evidence of efficacy. The systematic review by Lees, Manning and Rawlings was detailed, as was a systematic review focused on DSPD. This was followed by a presentation of the multi-centre lottery project: one major weakness uncovered in this project was the high attrition levels – starting with 313 people but falling to 15% of this. Three studies of economic evaluation were discussed, being those at the Henderson Hospital, Francis Dixon Lodge and the Cassel Hospital. All showed cost-benefit of TC treatment but were not of sufficient sample size or rigorous enough in methodology to be taken as ‘strong evidence’.

The session then opened to general discussion.

  • This started with discussion around the ‘length of stay’ – was it related to severity of symptoms (found in the US addiction TCs) or did people with more severe symptoms drop out of treatment (found in British PD TCs).
  • Perspectives from those not directly working in the field of TCs included the very real problem of small sample size and that with under 40 participants in a group, the study won’t be ‘taken seriously’.
  • The idea of re-focussing to all PD treatments in a planned environment (not just TCs) was raised.
  • The question of “when is it the right time to do an RCT” was much discussed, with ideas of maturity being needed and a series of studies leading up to an RCT, but also examples of treatments where RCTs have happened much quicker and the thought that maybe TCs have already waited too long. Example of 2 TCs set up on Henderson model - then closed before any RCT took place.
  • The effect of running an RCT on the TC environment was questioned. Would randomising people disempower members and destroy the milieu? How can we assess the effect? Can we do a pilot study to assess the effects of an RCT on the TC treatment? Focusing on a few treatment elements could lead to a ‘minimalisation’ of the service, with much being lost (example of Norwegian network).
  • Ethical questions on running an RCT were discussed. Since TC intervention has no real evidence base, in theory there should be no issue over ‘harm’. However, if staff have seen positive effects and ‘believe’ in TC treatment, can they really reach the right level of equipoise?

Session ended with tea-time.


Where are we?

Introduction – Rex Haigh

  • Starting at the World Federation of Therapeutic Communities (WFTC) conference in Sept 2006, which brought together the following:
USA: TCA (Therapeutic Communities of America); 6 regional associations, primarily addictions TCs, reliant on insurance money
Europe: EFTC (European Federation of TCs) & EWODOR (European Working Group on Drug Policy….) = scientific branch of EFTC
Australia: ATCA – supply, demand and harm reduction
South America: FLACT – children
Asia: FTC
Eastern Europe: FTCCEE
  • In June 2007 a meeting was conceived for all TCs (not just addictions). It raised the question: “Are certain research methodologies, especially Randomised Controlled Trials (RCTs), feasible for use in researching the effectiveness of Therapeutic Communities? Are they possible, or desirable?”
  • Definition: Addiction TC: “a concept TC is a drug free environment in which people with addictive (and other) problems live together in an organised and structured way in order to promote change and make possible a drug free life in the outside society” (Broekaert 2006)
  • Definition: British Model TC: “a consciously designed social environment and programme within a residential or day unit in which the social and group process is harnessed with therapeutic intent. In the therapeutic community the community is the primary therapeutic instrument” (Roberts 1997, 4). Community as Method approach
  • All TCs: first attempt at a definition: “A TC is a consciously designed drug free social environment in which people with various emotional problems live together in an organised and structured way. The social and group processes of the community are the method itself, and through them, change and recovery are promoted. In this way, a new life in outside society is made possible”

Discussion: what do people around the table think about the “is it possible to do experimental design research in TCs” and “is it desirable” questions?

  • A range of views were contributed around the table. There was strong support for doing an RCT: SP – “In Oxfordshire, 70 people in PD services. An RCT is both increasingly necessary and possible. Problems can be overcome (e.g. by using intent-to-treat analysis to improve sample size).” It was felt necessary to protect the milieu of the TC when carrying out the RCT.
  • The view that the future funding of TCs was in danger and needed to be supported by rigorous research was generally agreed. Some expressed the opinion that RCTs weren’t that useful for informing practice and were as “dull as ditchwater” (MF) but agreed that the lack of such research was damaging in today’s “evidence-based” climate of funding.
  • Other research alongside RCTs would be beneficial for informing clinical work and understanding – this could be qualitative research carried out in parallel. There was strong interest in evaluating complex interactions such as the TC programme, and in the idea that research could be part of a wider category of understanding. “We can’t put all our research eggs into the same basket” (MR)
  • Difficulties in doing an RCT were acknowledged, in particular the potential for the TC itself to be affected by the RCT taking place. However, the new ‘day TCs’ offered an easier model in which to carry out an RCT, whereas other settings (particularly forensic settings) were much harder.

What do we know so far?

Fiona Warren

This presentation gave a brief summary of TC research to date and then led to a critical discussion of outcome methodologies.

1. Heterogeneity of TCs. We have a definition, and Community of Communities standards, but there are many substantial differences between TCs: Democratic/hierarchical; Residential/day; Individual treatment/no individual treatment; Inpatient Psychotherapy/TC; Medication/no medication; Members involved in selection and discharge/not; Non-secure/secure; Personality disorder/psychosis. What is it we are testing?

2. Evidence of effectiveness. Where are we coming from? Do we ‘know nothing’ or do we accept the quasi-experimental studies to date? Some evidence for PD treatment but not TC?

  • No consensus as to outcome measure – yet key issue. Was also an issue for NICE in their setting of guidelines. Contains many things including use of services, forensic & mental health, etc.
  • Comparison needs to be with ‘treatment as usual’, but what is that?
  • Need medium term outcome: 1 – 5 years post treatment which is good.
  • Some evidence from residential TCs – none from the newer Day TCs.

3. Limitations of studies. Studies have looked at PD treatment – not just TC effectiveness. Best research designs were found in pharmacological studies, in which samples tended to be described well, if not selected as well as could be. The studies in higher security settings and with the treatments more tailored to the individual had poorer description and selection. They also tended to have poorer methodology in general. No studies in higher security settings discussed risk. Limitations include: Case identification: selection of patients; Axis II co-morbidity; Axis I + II co-morbidity; Poor descriptions of treatment; Poor descriptions of samples; Few experimental or quasi-experimental studies; Wide range of outcome measures; Non-standardisation of outcome measures; Lot of self-report; Short follow-up periods; High attrition rates from treatment and research; Small samples; Application of statistics

4. Methods challenge for TCs Challenges to meet the limitations outlined above. The majority of studies have been ‘naturalistic studies’ What are the criteria for conducting a valid RCT? – “Complexity”; Time; Ethics; Choice of outcome; Control condition So can we reach a consensus on what we know?

Rex Haigh

Some slides covering what may be familiar to people already.

1. Hierarchy of evidence

  • Type I evidence – at least one good systematic review, including at least one randomized controlled trial
  • Type II evidence – at least one good randomised controlled trial
  • Type III evidence – at least one well designed intervention study without randomisation
  • Type IV evidence – at least one well designed observational study
  • Type V evidence – expert opinion, including the opinion of service users and carers

Debate re Service User opinions: RH: In policy terms, SU opinion is valued, but not so in the hierarchy of research evidence? It may be considered in the same way as self-report or clinicians’ opinions. But we do have to make sure we are measuring things that are concrete, not opinions.

2. NHS Centre for Reviews and Dissemination - Systematic Literature Review & meta-analysis

  • Post-treatment and in-treatment outcome of therapeutic community treatment in secure or non-secure democratic therapeutic community settings for people with personality disorders or mentally disordered offenders Lees, J, Manning, N, Rawlings, B, 1999
  • Included 10 RCTs
  • Carried out a meta-analysis: the overall summary log odds ratio was -0.567 (95% confidence interval -0.524 to -0.614), indicating a strong positive effect for TC treatment.
  • Odds ratios calculated separately for the RCTs, and for the democratic, concept, and secure types of communities, all show strong results, with upper confidence limits well below one
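
The summary figure above is on the log scale, while the last bullet speaks of odds ratios being "below one". As a minimal sketch of how the two relate, the quoted log odds ratio and its interval can be back-transformed by exponentiation. Only the three numbers come from the notes; the variable and function names are illustrative, and whether a value below one favours TC treatment depends on how the outcome was coded in the review (the notes report it as a positive effect).

```python
import math

# Figures as quoted in the notes: summary log odds ratio and its 95% interval.
log_or = -0.567
ci_log = (-0.524, -0.614)

def to_odds_ratio(log_value: float) -> float:
    """Back-transform a log odds ratio to the odds-ratio scale."""
    return math.exp(log_value)

odds_ratio = to_odds_ratio(log_or)
# Sort so the interval reads lower-to-upper regardless of the order quoted.
ci_lower, ci_upper = sorted(to_odds_ratio(v) for v in ci_log)

print(f"odds ratio {odds_ratio:.2f} (95% CI {ci_lower:.2f} to {ci_upper:.2f})")
print("upper limit below 1:", ci_upper < 1)
```

On the odds-ratio scale the whole interval sits below one, which is what "upper confidence limits well below one" refers to.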

3. Another systematic literature review – focused on DSPD

  • Reviewed therapeutic community; cognitive, behavioural and cognitive behavioural; psychodynamic psychotherapy; pharmacological; and physical treatments for people with PD in general and for dangerous and severe personality disordered offenders.
  • Made a clear recommendation that the most promising treatment intervention for PD, in use or currently in development, was the TC model.

Comments were made about the methodology of this study. It was agreed there are problems with heterogeneity and that this study would probably not be acceptable within today’s definitions – e.g. not recognisable diagnosis categories. The question was raised as to which model was used for the meta-analysis – a fixed-effect or a random-effects model? For this study the researchers decided there was too much heterogeneity to do a meta-analysis, so left it as a systematic review.
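
To illustrate why the choice of model matters when studies are heterogeneous, here is a minimal sketch comparing a fixed-effect (inverse-variance) pooled estimate with a random-effects estimate using the DerSimonian-Laird method. The study values are entirely hypothetical and are not taken from either review; the function name `pool` is also an assumption made for this example.

```python
import math

def pool(effects, variances):
    """Pool study effects (e.g. log odds ratios) two ways.

    Fixed effect: inverse-variance weights.
    Random effects: DerSimonian-Laird estimate of between-study variance (tau2).
    """
    if len(effects) < 2 or len(effects) != len(variances):
        raise ValueError("need at least two studies with matching variances")
    weights = [1 / v for v in variances]
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total_w
    # Cochran's Q measures how far studies scatter around the fixed estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = total_w - sum(w * w for w in weights) / total_w
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    re_weights = [1 / (v + tau2) for v in variances]
    total_re = sum(re_weights)
    random_effects = sum(w * y for w, y in zip(re_weights, effects)) / total_re
    return {
        "fixed": fixed,
        "fixed_se": math.sqrt(1 / total_w),
        "random_effects": random_effects,
        "random_se": math.sqrt(1 / total_re),
        "q": q,
        "tau2": tau2,
    }

# Hypothetical log odds ratios and variances for five studies.
example = pool([-0.9, -0.2, -0.6, -1.3, 0.1], [0.04, 0.09, 0.06, 0.16, 0.05])
for key, value in example.items():
    print(f"{key}: {value:.3f}")
```

With heterogeneous inputs like these, the random-effects standard error is wider than the fixed-effect one, so a result that looks precise under a fixed-effect model can look much less certain once between-study variation is allowed for.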

4. Multi-centre Research 1999-2003

  • Association of Therapeutic Communities/National Lottery Charities Board Therapeutic Community Research Project – “A comparative evaluation of therapeutic community effectiveness for people with personality disorders”
  • … a naturalistic, comparative, cross-institutional study in the field of therapeutic communities for people with personality disorders.
  • Started with 313 people (60 TCs). TCs had problems collecting data. Data quality not that good. By 9 months down to 15% of original numbers. Much better response where TC had in-house researcher.
  • Included Three Qualitative Studies in NHS non-secure; Prison Service secure; High Secure Hospital
  • Measured characteristics of the Treatment Programme: Postal questionnaires to purchasers, referrers & line managers; Community questionnaires; Programme timetables
  • Looked at Residential Substance Abuse and Psychiatric Programmes Inventory (RESPPI); Policy and Service Characteristics Inventory (PASCI); Physical and Architectural Characteristics Inventory (PACI); Resident Characteristics Inventory (RESCI); Rating Scale for Observers (RSO); Rank Ordering of Activities (staff)
  • Baseline measures included: Social History Questionnaire; Personality Diagnostic Questionnaire (PDQ4+); EuroQol (EQ-5D); Brief Symptom Inventory (BrSI); Borderline Syndrome Index (BoSI); CORE; Rank Ordering
  • Outcome measures (in-treatment) included: Community Oriented Programmes Environment Scale (COPES); Brief Symptom Inventory; Borderline Syndrome Index; CORE; Rank Ordering; (EuroQol EQ-5D); Qualitative studies
  • Post-treatment measures included: Brief Symptom Inventory; Borderline Syndrome Index; CORE; Rank Ordering; (EuroQol EQ-5D); Reconviction; Readmission; Qualitative studies
  • Also considered “Who goes into TCs in the UK?” analysed by type of community: Day NHS; Residential NHS; Prison; Addiction; Private & voluntary and by Axis I and Axis II diagnoses at admission

5. Economic evaluations 1: Henderson

  • Dolan (1996) at Henderson, n=29:
  • Treatment cost = £25,461 per patient.
  • Total psychiatric and prison costs for the year before treatment were £335,196; for the year after treatment £31,390; Average cost-offset of £12,658
  • If maintained, would mean the Henderson treatment would pay for itself in just over two years.
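
The "just over two years" figure follows from dividing the treatment cost by the average annual cost-offset. The sketch below reproduces that arithmetic from the numbers quoted in the bullets; the variable names are illustrative. It also divides the total cost reduction by the quoted average, which is an inference from the figures and not something stated in the notes.

```python
# Figures as quoted in the notes for Dolan (1996).
treatment_cost = 25_461   # GBP per patient
cost_before = 335_196     # GBP, total psychiatric and prison costs, year before
cost_after = 31_390       # GBP, total psychiatric and prison costs, year after
average_offset = 12_658   # GBP, quoted average cost-offset

# Years needed for the annual offset to repay the treatment cost,
# assuming the offset is maintained at the same level each year.
payback_years = treatment_cost / average_offset

# Total reduction, and the number of patients the quoted average implies.
total_reduction = cost_before - cost_after
implied_patients = total_reduction / average_offset

print(f"payback period: {payback_years:.2f} years")
print(f"total reduction: GBP {total_reduction:,}")
print(f"patients implied by the average: {implied_patients:.1f}")
```

The implied divisor comes out close to 24, not the n=29 listed, which suggests the average may have been taken over a subset with complete cost data; that reading is an assumption and would need checking against the original paper.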

6. Economic evaluations 2: Francis Dixon Lodge, Leicester

  • Davies (1999) at FDL (n=56)
  • Service usage three years before & after treatment
  • Psychiatric bed days: 74 to 7.2 (ECRs); 36 to 12.1 (locals)
  • Average saving of £8,571 over three years following treatment.

7. Economic evaluations 3: Cassel Hospital

  • Chiesa (1996) n=26: admissions compared to post-treatment group, cost offset of £7423 per patient
  • Chiesa (2000): Intensive programme compared to short programme with psychosocial outreach nursing: “randomised” by M25, Outreach significantly better
  • Both much better than N Devon “CMHT treatment as usual” control group

So what do we know? TCs save money! But there is a difference between resource use and cash use.

Opened to general discussion.

EB: Need to start with “quality of life” as the outcome. Different scales can help us measure this.

GdeL: Idea of using other evidence based strategies to improve retention etc.

MC: I am cautious about the notion that longer treatment = better outcomes.

General discussion around retention, length of treatment, real randomisation as opposed to the Cassel study, and whether different populations need different lengths of treatment. Does dose = length of treatment? Is there a correlation between severity of symptoms and time needed (evidence from US addiction TCs)? Which client needs how much time to produce outcome X?

FW: Equivocal about length of stay. Cassel study showed shorter length of stay + outpatient maintenance of psycho-social nursing (6+6) more effective than long stay (12 months).

MF: However, people who had outpatient treatment were also those living closer to Cassel – this is a confounding variable. Other factors may contribute to outcomes. Different groups need different length of times. E.g. in Grendon need 18 months?

FW: Don’t really want length of stay to equate to outcome – people should leave when they’ve ‘had enough’.

GdeL: Done research on length of stay. Proxy variable = dose = time? 12 months is ‘critical’ dose. High severity addicts do not do well with shorter residency & outpatient - higher severity needs longer stay.

MF: In UK the most severe/disordered clients drop out so not completing treatment/no research data for them.

GdeL: In substance abuse TCs crime levels also correlated with length of stay – need at least 1 year to show reduction for more severity of criminal behaviour

RH: Maybe we can start to summarise what have we learned already? What do those researchers etc outside the TC field gather so far?

GdeL: All this has been about TCs for PD – what about evidence for addiction TCs?

JY: My perspective on discussion so far: in the list of problems, sample size is probably the hardest problem to deal with. A few studies with small sample sizes are not convincing. Plus there simply aren’t enough studies – no-one will be convinced on results from just one study. Although there may be confusion over what outcome measures should be used, etc., no matter how good the design is, poor sample size can scupper the results. Need to amass studies together.

FW: Except that if your outcome measures are not focused towards sample size then I think there is enough evidence.

JY: So why are the attrition rates in these studies so abominable?

FW: One reason could be the commitment of people collecting data – clinicians who are already over stretched with clinical work simply stop collecting data.

RH: We have tried to learn from the lottery project – now in the TC Research Network (TCRN) the clinicians and service users have really become involved in the process and ‘own’ the data themselves. People in each service play an active part.

JY: Am I right in understanding the maximum number of people in a TC is about 18? To get a study published in a top journal you need more.

MC: You need a sample of a minimum of 40 which we got in a psychotherapy/PD study. If the effect size is big enough you only need a small sample size.
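
The trade-off between effect size and sample size mentioned here can be made concrete with a standard power calculation. The sketch below uses the normal-approximation formula for comparing two group means, n per arm = 2(z_{1-α/2} + z_{power})² / d², where d is a standardised effect size. The two-sided 5% significance level, 80% power and the choice of a continuous outcome are assumptions for illustration only, not parameters agreed at the meeting, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def _z_sum(alpha: float, power: float) -> float:
    """Sum of the normal quantiles for a two-sided test at the given power."""
    normal = NormalDist()
    return normal.inv_cdf(1 - alpha / 2) + normal.inv_cdf(power)

def n_per_arm(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate participants needed in each arm of a two-arm trial."""
    return math.ceil(2 * _z_sum(alpha, power) ** 2 / effect_size ** 2)

def detectable_effect(n: int, alpha: float = 0.05, power: float = 0.80) -> float:
    """Smallest standardised effect detectable with n participants per arm."""
    return math.sqrt(2 * _z_sum(alpha, power) ** 2 / n)

for d in (0.2, 0.5, 0.8):
    print(f"effect size {d}: about {n_per_arm(d)} per arm")
print(f"40 per arm detects an effect of about {detectable_effect(40):.2f}")
```

Under these assumptions a large effect needs only a few dozen participants per arm, while a small effect needs several hundred, which is the point being made about small samples sufficing only when the effect is big.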

NB: I think we need to re-focus. We don’t have that many TCs fully operating and so we can’t do a full scale multi-centre trial. This is particularly true if we keep to a narrow coherence of what “TC” means. We could look at PD services as a whole, not just TCs, and take the “treatment element” (using the planned environment as treatment-model). This would open up fields to the critical element of “TC-ness” being its use of environment as treatment. If we stick only to TCs there are few, all with small numbers. So focusing on the ‘planned environment’ could include other treatments than TC. For example we are currently running 2 other RCTs – one in multi-systemic therapy and one very small in ‘CRACKEMS’ which is a step-down project for people coming out of high secure services. These are tiny samples again – more like feasibility studies? But we thought it was worth having a go.

GdeL: Very important point we’ve looked at historically – when is the right time to carry out an RCT in a TC? In medical trials, an RCT only happens after much is known about the treatment and the disorder – taking as much as 75 years. In a maturing field we need to first understand the disorder, then understand recovery before carrying out the RCT.

NB: We have 2 things here: the therapeutic project “do people with PD get better in TCs?” and the political project to get funding/evidence that will be accepted by ‘government’. The tension is how do we get government to manage that dilemma – is it the right time, when we acknowledge at the start that TCs and planned environment treatments are about to go down the tubes unless we make some case for evidence?

MC: Isn’t this a little bit pessimistic – Linehan didn’t wait 70 years to do an RCT on DBT and showed effectiveness for women that self-harm, people with Borderline Personality Disorder. Bateman didn’t wait 75 years to test mentalisation treatment and showed it had major impact. TCs have waited a good many years, despite really good documents building up an understanding of the effectiveness in the context of personality disorder – maybe we have already waited too long? In different parts of the country they are not setting up TCs they are setting up DBT services since that is where the evidence is.

GdeL: There have been DBT RCTs on eleven sites with sample size over 100.

GdeL: These are not irreconcilable positions. Encouraged by what I’ve heard there is a general similarity with studies showing effects (if questionable methodology) and the reporting is consistent with the addiction TCs. This is evidence for effectiveness (people do change) but level of proof low – RCTs would be a higher standard of evidence. An RCT can be implemented for a TC for PD but it would be small – purpose to test the feasibility of running an RCT. Then run sequential trials, systematically related to each other (each informing the next). So I would recommend that you identify 1 or 2 models that are high fidelity models – this is the treatment element by element

RH: I hope we have that through developing high fidelity measures through the setting of standards by the Community of Communities.

GdeL: The models in the field are very different from the specified theoretical model – but setting verifiable standards is on the right track. But we need the ‘policy makers’ to appreciate how the science is working in the small, and so use the sequential trials method.

FW: It’s very interesting to hear you say that, and I just want to put something in from the experiences at the Henderson. On the basis of evidence two new TCs were set up that were subjected to fidelity checks and process of setting up to be written up – with the understanding an RCT would be carried out. The services closed before that happened. The same thing could happen again.

GdeL: This is why you need a partnership with policy makers?

SP: Do we have consensus that it is possible to run a TC in an RCT? Is there a stumbling block in that to randomise people going into a TC dis-empowers the members and destroys the milieu?

RH: No, we don’t have consensus on this yet – it is for discussion in next session.

GdeL: The first trial should be used to look at the perturbance of doing an RCT on the TC process itself so we find out either how to solve the problem or that the problem cannot be overcome.

MC: I agree we need to do small RCT to see any effect on treatment process.

SP: Agreed – but curious as to why for the last 20 years there has been a ‘TC Consensus’ that RCTs are not possible. Has an enormous problem just gone away?

GdeL: People also give other reasons why we can’t do an RCT.

JY: How do you establish the effect of an RCT on a TC?

MC: Use qualitative methods to research effects of randomisation. Get the views of people in the TC before the study. Need ‘treatment as usual’ to be sufficient. It is possible that patients will value an RCT in that it does good to others – particularly if it is central to the survival of the model.

RH: We have started that with a Google discussion group on TCRN?

NB: With the CRACKEM project we had a problem with ethics – parole boards release people ‘subject to attending a TC’. Also, ‘what right do we have to randomise treatment when such harm can be done?’ But if the intervention has no evidence base, in theory there is no issue over harm.

EB: Would this be in one definable part of treatment? Need to define all the treatment elements.

GdeL: In addiction TCs we have taken 30 years to hypothesise the importance of different elements. Have to recognise what important elements are – minimum threshold of essential treatment.

RH: Would this cause ‘drift’ towards minimalisation, as happened in the Norwegian Network experience? There, 15 day units started as TCs; they researched different elements and ‘thinned it down’ to group therapy. But it has proved less useful since it can’t hold people with severe symptoms.

MC: In this example, there was already much lost in the day treatment. There is the ethical issue as to whether randomisation is right. Morally is there a question of equipoise? Clinical staff have seen positive effects of TCs, believe in the intervention and cannot stomach equipoise?

NB: Whole idea of what we see involves complex processes re staff in as an active agreement.

KH: If people for good or bad reasons are hostile to randomisation then the pilot has to be done as ‘not a baby RCT’ but as a study as to what needs to be done before an RCT.

NB: There are whole ranges of possibilities– for example we know of attrition for people coming out of prison into ‘anything’. What we’ve had to look at is the preparatory stage of engaging with people in prison so they fully understand the participation needed when they come out.

RH: We are at tea-time. And starting to move into the questions of the next session…

SESSION TWO: 4.30pm Monday


George de Leon started this second session with his presentation “A Rush to the Gold Standard”. He introduced research from Addiction TCs in the US, being ‘field effectiveness trials’ for large populations over 40 years, and detailed the distinction between research evidence and ‘evidence-based treatment’ (i.e. treatment utilising evidence based principles). The different levels of evidence were discussed. Utilising the stage/phase model used in US medical research, he described how stages 1 and 2 were met by the evidence produced from field studies, and that questions were formulated at this stage being: Who comes for treatment? What are the success rates? Is there a relationship between Treatment “Dosage” and Outcomes?

There was then an assessment of what was required for an RCT in a TC, considering elements of treatment being operationalised, high fidelity maintained and an appropriate control/comparison condition being found. The assessment process, with outcomes equalling ‘proof’, was discussed, in particular in relation to the “self-selecting” factor. The presentation finished by looking at appropriate analytic design and some final points. We have no “evidence” from an “RCT” as to the effectiveness of University or of the ‘good family’ – do we really need RCT evidence as to the effectiveness of the TC? The discussion was then opened up to questions and comments. Issues such as how to operationalise and measure ‘self-selection’, sample sizes, minimising TC elements in the control treatment and using wait-lists as control treatments were discussed. It was suggested that attempting to resolve all possible issues and aiming for ‘perfection’ would mean that an RCT in a TC would never get done – pragmatically we need to treat the TC treatment as a ‘black box’ for the initial RCT. Differences in problems between community and prison settings were also raised, with an idea of using different intensity of treatment in prisons as a possible control.

At 5.45pm, the group started to generate a consensus statement for the session. It was agreed that levels of existing evidence were sufficient to now carry out an RCT but potential difficulties such as the ability of clinicians to hold a position of ‘equipoise’ and problems in receiving funding were also noted.

Consensus on methodological problems

  • RCTs of therapeutic communities are an essential aspect of the enquiry – with policy makers, funders and clinicians – for proper development of treatment programmes for PD, addictions and psychosis.
  • A democratic therapeutic community RCT has not been undertaken hitherto; potential barriers have included
    • The clinical community’s inability to hold a position of equipoise
    • Funding bodies’ reluctance to support PD research
    • Complexity arising from the involvement of clients in selecting each other and the consequences of this for waitlist controls, blinding etc.
    • Concerns about the internal/external validity trade-off
  • There is now sufficient data in the PD field to design a sequentially organised programme of randomised controlled studies
  • A small study to examine the effect of randomisation is the first step required in respect of PD: to both examine whether randomisation is possible, and whether it has a deleterious effect on the treatment programme
  • Reanalysing data using more sophisticated models may be helpful - in outcome and in population selection, including severity (also see consensus 3)
  • The following table shows the current research phase of the various types of TC:

Stage of evaluation | PD community | PD forensic | Addictions | Psychosis
Phase 0 (descriptive) | | | |
Phase 1 | | | |
Phase 2 (pilot trial) | | | |
Phase 3 (effectiveness trial) | | | |


Why are RCTs such a problem for TCs?

Methodological complexities

Underlying difficulties

George de Leon – Presentation “Rush to the Gold Standard”

  • Introduces research from addiction TCs in the US. These are essentially ‘field effectiveness studies’. There is some criticism that TCs do not have evidence-based studies: the evidence from 40 years of observing change in people through TC treatment is dismissed because it does not meet the ‘gold standard’.

  • Therapeutic communities (TCs) for addiction treatment arose outside of the medical, mental health and scientific mainstream in response to unmet needs.

  • A considerable knowledge base about TCs in particular and drug treatment in general has developed primarily from field effectiveness studies rather than controlled efficacy research.

  • In the universal call for evidence-based treatments some critics have concluded that the TC is not an evidence-based treatment.

  • Forty years of observational and quasi-experimental research is dismissed as not meeting the gold standard of research design (hence the sardonic title to this talk). Given the relative lack of randomized, double-blind controlled trials, it is asserted that the effectiveness of the TC has not been “proven.”


2. Evidence: Some Distinctions

There are different levels of evidence: expert-driven but not tested; consensus-based; meta-analyses or systematic reviews; and consistent findings across populations, settings, and treatments. There is a distinction between research evidence and evidence-based treatment (treatment utilising evidence-based principles). Evidence-Based (EB) Practices are specific strategies which have been tested and recommended for use, and which lead to evidence-based treatments (EBT) and programs (EBPg). There is now an intelligent movement in the USA that says what we need is to see that a treatment is utilising good evidence-based principles, rather than re-documentation of a particular treatment to say that the treatment has enough evidence. We use best knowledge and evidence from research combined with experience to implement what we are doing in treatment. This represents a maturation of the field.

Evidence-Based Principles refer to broad guidelines, e.g. use of behavioural and cognitive methods, targeting individual risks and needs, monitoring and accountability, continuity of care. These guidelines do not tell programs which specific interventions providers must use, but instead which principles they must follow.

3. Evidence and Research Design

Looking at evidence and research design, in the US there is a managed ‘stage-phased’ model for research and developing medications for treatment. This has been adapted by NIDA for developing psychosocial and behavioural treatments for substance abuse disorder. This model directs steps from early stages demonstrating initial feasibility, through efficacy and then field effectiveness studies (with various steps leading from small-scale to larger studies). Each stage has its own evidence, so an ‘evidence base’ is built up stage by stage; this contradicts the idea that if we don’t have an RCT we don’t have an evidence base.

Each stage has different research questions and a stage-appropriate research design, and produces evidence to validate undertaking the next stage of research. Thus, the term ‘evidence-based’ should be understood in relation to the stage/phase of research development. The task for today, with PD, is the ‘science’ question: Do we have enough existing research to warrant an RCT? Are we well enough prepared, and do we know enough, to do an RCT? Have we enough information to prevent an RCT being a high-risk strategy? Are there too many worries to get close to an RCT design? Can we use a sequential approach to work on some of the issues?

4. TC RESEARCH STAGE 1 AND 2: Evidence from field studies (Some Meta Estimates)

Summary of evidence from (mainly) US TC research, Stages 1 & 2 (non-randomised). These meta estimates are from the outcome studies only: field studies, not controlled or randomised. Over 60,000 admissions to community and institutionally based TCs worldwide were entered into multi-modality and single-program studies (1969-2000). From these, over 10,000 individuals were followed up to 12 years post treatment. The data were collected by different research teams across different years and different cultures. They assess multiple outcome variables with similar instruments, follow-up and statistical methodology. In all these follow-up studies (across the US, Europe and other areas), the instrumentation was roughly similar, as was the methodology. One of the earliest studies was in a big program at Phoenix House in New York where outcome measures were scored on 550 people. This was a single-programme study but it informed instrumentation and follow-up methodology for other, large multi-modality, multi-programme studies.

The three critical outcome variables were always the same, among them drug use (endless examples of the drugs used and not used) and crime (massive variation in seriousness of crime and in criminal justice outcomes, e.g. arrests, conviction rates, incarceration). These could be considered purely behavioural outcomes, not really psychological outcomes, but many studies did look at other questions about psychological status. The results are strikingly similar, yielding “lawful” findings with respect to profiles (who the client was, severity, great numbers of entry variables), outcomes and retention. Some of the best outcomes always came from single programmes (fewer problems with fidelity), e.g. Phoenix House, where 75% of 100 graduates met criteria of success. All drop-outs were followed up. This was only a field-effectiveness study, not an RCT, but it produced a lawful relationship that showed success as a linear function of the length of stay in treatment. There was only a 5% variation across cultures and countries.

5. TC RESEARCH STAGES 1 AND 2: The main questions and conclusions.

So here is the summary of the main questions for this research. The first question, asked in 1965 for addiction TCs, was: Who comes for treatment? Who are these clients? It was concluded from the profiles of admissions that clients who go for treatment in long-term residential facilities are the most severe in terms of drug use, criminal activities, psychological problems, social deficits and so on. The next question was: Do clients change? What are the success rates? Evidence showed that individuals change during and following treatment (but gave no answer as to ‘why’). Then we asked: is there a relationship between treatment “dosage” and outcomes? Evidence shows that retention consistently predicts outcomes; this is the most consistent finding in all the drug-treatment literature. The next stage for research, testing the effectiveness of TCs, is grounded in these 30 years of evidence from field outcome studies, which validates the decision to try an RCT. However, such trials, in the field, are difficult to assemble. We can now look at some of the methodological points very relevant to our discussion.
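The “dosage” question above (does length of stay predict outcome?) can be illustrated with a minimal least-squares sketch. The figures in it are deliberately simple, made-up values chosen only to show the calculation; they are not data from Phoenix House or any other study, and the names `fit_line`, `months_in_treatment` and `success_rate` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# HYPOTHETICAL illustrative figures, NOT results from any study:
# months retained in treatment -> proportion meeting success criteria.
months_in_treatment = [3, 6, 9, 12, 18, 24]
success_rate = [0.20, 0.30, 0.40, 0.50, 0.70, 0.90]

slope, intercept = fit_line(months_in_treatment, success_rate)
print(f"change in success rate per extra month retained: {slope:.3f}")
```

A slope of this kind describes an association in field data only. Because clients who stay longer may differ from those who leave, it cannot by itself show that longer treatment causes better outcomes, which is the gap an RCT is meant to address.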

6. Requirements for an RCT in field settings (Therapeutic Communities)

1) There has to be sufficient evidence from field effectiveness studies (i.e. stages 1 & 2) that can provide a true test of the hypothesis that TC treatment is effective. An RCT would then move the knowledge base along.

2) The theory and active ingredient of treatment are explicitly articulated, operationalised, and measurable. Without this, we cannot run an RCT since we can’t be sure what treatment is giving the outcome effect.

3) Have tried to summarise the TC treatment as “Community as Method” – did this years ago for policy makers! Moved terminology away from testimony to methodology. Have to specify what is meant by this: to do an RCT need to ensure the treatment model is implemented with high fidelity. This requires fidelity assessment methods/parameters/thresholds. For “Community as Method” we need confidence in consistent fidelity. This is hard in long-term complex treatments. Fidelity is easier to achieve in single treatment methods (e.g. DBT).

4) An appropriate control / comparison condition is essential for an RCT. This includes:
  • Equivalent residential settings (note that residential setting does not equal treatment; setting is NOT treatment).
  • Same planned duration of residency (vary or post hoc control), since there may be a correlation with success of treatment.
  • The retention/completion rates are equivalent between treatment and control groups.
  • Client profiles (social, demographic, psychological, drug severity, social deviancy, motivation, legal coercion) should be matched between treatment and control. Randomising can compensate for this but only with a large sample size; a small randomised study with a sample such as 30 won't give this. Motivation is a major factor (I will come back to this later).
  • There should be minimal overlap between the comparative conditions with respect to TC elements. It is very difficult to keep the control group "clean": elements such as community structure, community meetings, shared meals, community expectations for participation, community privileges and sanctions, and peer accountability activities must not be part of the control condition. This is particularly the case for residential control groups.

  • Staffing pattern should be equivalent (e.g. client/staff ratios; recovered/non
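
The point above about client profiles, that randomisation only balances the arms when the sample is large, can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a power calculation: each simulated client has a single hypothetical baseline score (think of it as ‘motivation’) drawn from a standard normal distribution, and the function name `mean_imbalance` and its parameters are invented for the illustration.

```python
import random
import statistics


def mean_imbalance(n_clients, n_trials=500, seed=0):
    """Average absolute gap in mean baseline score between two random arms.

    Assumption: one hypothetical baseline score per client, standard normal.
    Clients are split at random into two equal arms, as in a simple RCT.
    """
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_trials):
        scores = [rng.gauss(0.0, 1.0) for _ in range(n_clients)]
        rng.shuffle(scores)  # random allocation to arms
        half = n_clients // 2
        treatment, control = scores[:half], scores[half:2 * half]
        gaps.append(abs(statistics.fmean(treatment) - statistics.fmean(control)))
    return statistics.fmean(gaps)


small = mean_imbalance(30)
large = mean_imbalance(1000)
print(f"n=30:   typical baseline gap between arms = {small:.2f} SD")
print(f"n=1000: typical baseline gap between arms = {large:.2f} SD")
```

The chance gap between arms shrinks roughly with the square root of the sample size, so a trial of about 30 clients can easily end up with arms that differ noticeably at baseline even though allocation was random.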


7. Assessment

For the treatment process, it is essential that outcome measures relate to establishing “proof”. It is also desirable to have not only a good outcome measure but also a good measure of process (though we can pass on this in early-stage RCTs). The finding that convinces people (and gets close to the ‘Gold Standard’) is an analysis showing that elements of treatment and client change relate to outcomes, indicating the causal explanation we want. A major factor is self-selection, including differences between those who refuse and those who accept randomisation, and this also needs to be assessed. The client makes their own selection: treatment cannot work unless the client makes that selection. A major focus of treatment is to cultivate and reinforce self-selection factors such as motivation. Research must identify not whether, but how, selection factors contribute to treatment effectiveness and how treatment elements enhance these factors. You need to operationalise the factors of motivation and “ready to change”.

We know from the substantial literature on clients’ motivation and readiness for treatment, and it was always clinically obvious, that individuals who come for treatment are different from those who don’t; individuals who stay in treatment are different from those who don’t; and the individuals who finally get well are different from those who don’t. The selection factor is universal and pervasive, and a major problem for research design. However, rather than seeing selection as a problem for design, it should be considered a prerequisite for treatment. The client who comes into the community, and who uses it to change, has to decide whether to listen and whether to participate: that is the selection feature, and the treatment cannot work unless there is that selection from the client. Therefore self-selection is a critical variable that makes treatment work, and so it has to be measured and not misinterpreted.

JY: Is intervention group self-selecting?

GdeL: No. Have to take into account with randomization. Ethical considerations mean that clients are fully informed as to what they are being randomized to. We need acceptance by the client – could get high refusal rate by clients – so need to factor willingness to be randomized. We could introduce aftercare as a DV or into methodology/treatment.

Need to ensure that aftercare/post-treatment interventions/services for both comparative conditions are available, equivalent, utilised and assessed at the same rate (unless it is decided to make aftercare a DV). Also, the comparative design must be appropriate for the complexity of the treatment; e.g. MMT or CBT are less complex interventions than “community as method.” Simple but obvious distinction!

8. Appropriate Analytic Designs

A final consideration for controlled trials is how to analyse the whole issue of treatment process. One way is a ‘Deconstructive Design’, where you take out one element of the TC treatment that is thought to be essential and find out whether it was indeed an active ingredient. Isolating single ingredients by subtraction from the whole approach will “prove” their essentiality (e.g., eliminate encounter groups, community meetings, sanctions/privileges). However, although this can show critical, active ingredients of treatment, it will alter, or perturb, the integrity of the treatment model.

A preferable way would be to use our clinical experience and what research already exists to inform us as to what we think the active ingredients are, and then enhance those ingredients (e.g. running 6 groups a week rather than 3). This is an ‘Enhancive Design’; whilst it informs about single ingredients by adding to them or intensifying their magnitude, it also perturbs the integrity of the model.

The third (and best) option to really analyze any complex treatment like TCs is to use a ‘Synergistic/Global Design’. This allows us to really answer the question of what it is about the treatment that affects the ultimate outcome by dealing with and assessing the dynamic and reciprocal interactions among the theoretically articulated ingredients. For instance, a client goes into a group, leaves a group, attends a seminar, talks to a fellow member: all these are activities in the community setting which impact on the individual client. To try and isolate any one of these seems too simplistic and too perturbing. However, whilst this design retains the treatment model, the analysis is extremely complex.

9. Final Points:

  • Level of certainty: Control vs. the “Weight” of Evidence

This is extremely relevant to us. In the US we talk about ‘weight’ of evidence, not ‘certainty’ of evidence. The argument is that we have enormous weight of evidence from those field studies with very high sample sizes and we know that people change. This has been enough to keep this research approach funded in the US even though it has always been criticised for not meeting the gold standard.

  • Global Models of Socialization: TC “treatment” as An Educational or Family Model

When you think about an approach such as a TC using ‘community as method’, as opposed to a straightforward manualised treatment, we are really talking about the ways the community can beneficially affect the individual.

There has never been an RCT on the effectiveness of University education, yet nobody disputes that it ‘works’, even though universities are allowed to pick the best clients! We have lots of ideas about what makes a ‘good family’ and how beneficial that is, but there has been no RCT on the effect of good families.

Eddie Kane (Chair) opens up discussion for thoughts and questions.

EB: Are we making a mistake in trying to compare observable and the unobservable? If self selection is an action, and treatment is an action, does this lead to behaviour which is observable?

GdeL: The term ‘self-selection’ represents a client in a certain status and is indicated by verbal behaviours (showing motivation etc). We have indirectly operationalised self-selection and motivation and clinically measure this along the way.

SP: A lot of what you were saying applies to all complex interventions and there is technology out there for dealing with that. Issues like maximising effect size and needing to minimise TC bits of control treatment – we can worry about this for ever and it may never be solved. I didn’t think anything you said was an argument against running an RCT – it may be an argument for delaying it and possibly that’s been happening. My approach would be not to worry about what is it in a TC that is effective – treat it as a ‘black box’ - try and minimise elements of the ‘black box’ within treatment as usual – but recognise that if you attempt perfection you will never do it! So the pragmatic view is to take the TC as a "black box" and test it against what would otherwise happen to show effectiveness.

GdeL: I’m not sure there is enough discussion and concern about explicit problems. We need to consider the "risk" involved in randomising a highly motivated, severely disturbed client. I could not do that in good conscience. There are difficulties on the ethical side as well as the science side. We need to take the path of promising less and doing it in small increments.

MC: This is the key obstacle in the TC field, and more of an ethical problem than a scientific one. At the scientific level, randomising to ‘next to nothing’ is best, but this was not considered ethical. This complicates the logistics: we need funding for a control treatment where none currently exists.

NB: Agree that "treatment as usual" is often nothing for many people in this country.

MC: But no one would agree to randomisation if the control was nothing. If we can have funding for a researcher to identify a PD population which is double the maximum size the TC can take, then the possibility of “randomisation to nothing” is easier.

JY: Why is it not considered worth thinking about the control group having treatment afterwards (wait-list control)?

MC: Problem is treatment can be five years - 18 months minimum. So a wait-list is not really feasible?

MF: Very interested in wait-list controls: in DSPD the intervention is very bed-limited; we have a population of 3000 for 300 beds. Does it matter how we decide who goes in there, since the client will be sitting in the prison for the next so many years anyway? It is less of an ethical problem due to the service constraints.

GdeL: We could use mature programmes that can count as treatment as usual; the treatment condition can then be enhanced in some way in the treatment provided at another TC. This way all clients get some treatment.

NB: Would that be the enhancement of the same element of the intervention, or a different intervention?

GdeL: For example we could train peers as role models – this could be an enhancement.

NB: That doesn't measure a TC per se, just a critical element.

MC: In terms of the UK: it is a ‘do something or nothing’ position rather than ‘do something or do something better’. It is different in prisons than in the community. More ‘set treatments’ in prison – often to satisfy the parole board. But in the community resources are scarce.

KH: There isn’t actually a menu of PD treatments in prison, is there?

MC: There is in effect: the most intensive intervention is in a DSPD unit, the second most intensive intervention is in prisons like Grendon and TCs like that

KH: Maybe we could use those different intensities of intervention as a control.

Rex Haigh: starts to formulate the consensus statement.

RH: So far I have 7 headings: randomisation, ethics, other limitations, design, treatment integrity, attrition, and outcome

KH: Do we have a consensus that an RCT is a desirable thing in TCs?

NB: Or are we accepting a staged design?

EK: We have to deal with an RCT at some point

SP: Are we looking at TCs for PD only, or different fields?

MY: Depends on whether they can meet the criteria for an RCT.

SP: We don’t have to tick all the boxes straight off. We can first run a feasibility study as to doing an RCT

GdeL: We have sufficient research to do an RCT in a TC for PD. There are requirements for a randomised trial – threshold criteria.

MC: Active stance of intervention: does motivation to treatment = motivation to be randomised? How can randomised selection form a community? We need to find out how people feel.

NB: We will have to run a TC in different lines of selection.

MC: Need to be supportive in the control condition: maybe a clinician from the TC to give control treatment?

NB: That would work from a PCT perspective. But will we have professional, clinical staff who are prepared to work this way?

SP: Members are not necessarily ‘anti’ an RCT. They realise it might be necessary for our survival. The staff could be a problem – and possibly referrers?

MY: We would need to consider these as covariates within the population. My preliminary view is we would have to analyse the data on a multi-level random effects model.
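
One common way to write down a multi-level random effects model of the kind MY mentions is the two-level random-intercept form below. It is offered as a sketch only, not as the analysis the group agreed on, and every symbol is an assumption introduced for illustration: y_ij is the outcome for client i in community (or cohort) j, T_ij indicates allocation to the TC condition, x_ij collects client-level covariates, u_j is an effect shared by everyone in the same community, and ε_ij is individual-level error.

```latex
y_{ij} = \beta_0 + \beta_1 T_{ij} + \boldsymbol{\beta}_2^{\top}\mathbf{x}_{ij} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2),
\quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
```

Here β_1 is the treatment effect of interest, and the community-level term u_j allows for the fact that clients treated in the same community are not independent of one another.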

NB: Is there a question about generalisability? Does the profile of people we use allow generalisability? (Policy often means that only the most severe are treated.)

JL: Need to decide are we narrowing this down to PD or addictions or keeping diagnosis open?

MC: Also whether we are going to use a TC in the community? Or in a prison?

Consensus reached

SESSION THREE: 9.15am Tuesday

