ConditionedReinforcement - version_tracker.txt
----------------------------

Version numbers are stored in:

(1) .iss file (influences setup display, Control Panel)
(2) .rc file (About box)
(3) g_strVersion
(4) strConditionedReinforcementXMLConfigVersion
(5) help file (.hm3)
(6) web site whatsnew.shtml
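
Since the same number has to be updated in six places, a quick cross-check can catch a missed one.
A minimal sketch in Python (not part of the task code; the function name and the location labels
are hypothetical, and the version strings are assumed to have been collected by hand or by some
other script, since the file formats are not parsed here):

```python
from collections import Counter

def find_version_mismatches(versions):
    """Return the locations whose version string differs from the most common one.

    versions: dict mapping a location label (e.g. ".iss", ".rc") to the
    version string found there. An empty dict result means all agree.
    """
    normalised = {where: v.strip() for where, v in versions.items()}
    if len(set(normalised.values())) <= 1:
        return {}
    # Treat the most frequent string as the intended version (an assumption).
    reference, _count = Counter(normalised.values()).most_common(1)[0]
    return {where: v for where, v in normalised.items() if v != reference}

if __name__ == "__main__":
    found = {
        ".iss": "2.1",
        ".rc": "2.1",
        "g_strVersion": "2.0",   # deliberately stale, for illustration
        "help file": "2.1",
    }
    print(find_version_mismatches(found))
```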

Version history:

0.1
========================

- 25 October 2004: copied from PIT task.
  Task requested by Joff Lee.
  Requirements:
	CONDITIONING PHASE
	- no levers
	- nosepoke for (CS + reinforcer) on an FR1 schedule
	- CS = L or R lever light (solid, not flashing)
	       houselight off
		   20 s
	- option for timeout following reinforcer (or CS, equiv. to same thing)
	- nosepokes during CS have no programmed consequence
	- options for maximum reinfs (default 50) and session length (default 90 min).

	CONDITIONED REINFORCEMENT PHASE
	- both levers extended
	- press on lever below CS reinforced with 1-s version of CS (no reinforcer)
	- other lever has no programmed consequence
	- 30 min session
	- no maximum to no. CSs delivered

	NONCONTINGENT CS PHASE ("for a reconsolidation experiment")
	- no levers
	- 20-s CS presented on FT or VT schedule

- Note RCS time/datestamp problem on several files (why? NTC service went wrong?) -
  some files had first revision date as 10 April 2005. So to check them in
  I advanced the computer clock by a year (to 25 October 2005)...
- Fixed only by manually editing/resaving the copy in the repository

0.2 (11 Nov 2004)
========================

- CS0 (alternative, nonreinforced CS) in acquisition-of-new-response phase (as control for sensory reinforcement)
- option to present CS0 in noncontingent phase (to reduce its novelty for acq-new-resp phase)
- option to have each CS flashing yes/no (with appropriate other params controlling flash timing)
- discrimination stimulus conditioning schedule

0.3 (21 Feb 2005)
========================

NOTES ON ZIMMERMAN STUDIES

- Zimmerman DW (1957). Durable secondary reinforcement: method and theory. Psych. Rev. 64: 373-383.

  1. Training I: thirsty rat. Present 2s buzzer then operate water dipper (at 2-minute intervals).
     Independent of rat's behaviour (except that buzzer never sounds when rat near dipper).
	 Train until reliable approach.

  2. Training II: omit water on alternate buzzer presentations, then increase number of omissions.
     End up with a mean of 10:1 (CS:US), with the longest nonreinforced run being 14:1.
	 So buzzer signals "water *may* now be available"; never get water except after buzzer.
	 Approach continues reliably despite reduced reinforcement ratio.

  3. Testing: Allow rat to press bar for buzzer.
     For example: day 1: reinforce first 6 responses, then on FI-1min schedule.
	 "Various fixed-ratio schedules have also given satisfactory result."
	 "Intermittent reinforcement schedules of the kind just suggested will increase response output
	 *provided that*, on the basis of the prior training procedure (involving intermittent primary
	 reinforcement), the secondary reinforcer itself has been made "strong enough" to withstand a
	 schedule of this kind. If [it] is weak, as in the usual sort of secondary reinforcement experiment,
	 its effectiveness will wear out before any such schedule has a chance to operate."

	 Points to note:
	 * The CRf is discrete.
	 * The CRf has an independently observable effect on behaviour before the test procedure begins.
	 * Water never appears without being preceded by the stimulus, but the stimulus does not ensure
	   the delivery of water.
	 * The intermittent-reinforcement ratios used (both CS:US and response:CS) are quite high.
	   Requires gradual approximation.
	 * The reinforcing effect is large enough to be easily apparent in a single animal.
	 * With such a high ratio, in the acquisition-of-new-response phase, previous unreinforced bar presses
	   may themselves come to predict eventual CRf delivery, maintaining high rates of responding.

	 Comments on Dinsmoor (1950):
	 * Dinsmoor: any stimulus that is a CRf also has cue properties, and vice versa.
	 * Consequence: a stimulus that is "repeatedly and consistently" paired with primary reinforcement
	   presumably will *not* acquire secondary reinforcement properties unless, in addition, it has first
	   played a discriminative role of some kind, as a result of a differential reinforcement procedure.
	 * Zimmerman: "although the results are not as yet unequivocal, enough evidence now exists to justify the
	   use of a discrimination training procedure whenever an effort is being made to give a stimulus
	   secondary reinforcing properties."
	 * Suggests that secondary reinforcers are only effective if they already have some *response* conditioned
	   to them (in this case, approach to the water spout in response to the buzzer)...
	 * Or perhaps discrimination role necessary, but not sufficient... etc.

- Zimmerman DW (1959). Sustained performance in rats based on secondary reinforcement. J. Comp. Physiol. Psychol. 52: 353-358.

  Method in which the response released by the secondary reinforcer takes the animal into a wholly distinctive
  environment. Namely: train rats to run down alley for food. Then make this unpredictably and partially reinforced.
  Then use a buzzer as a "ready signal" to indicate that the rat will shortly be allowed to run down the alley
  for food. Then acquire a new response (bar pressing) in the start box for presentation of the ready signal
  and the opportunity to run. Control groups: (1) no bar-buzzer contingency; (2) no food in runway during training.

  So perhaps less good than the 1957 study as a clean acquisition-of-new-response procedure.

OUR PROTOTYPICAL CONDITIONED REINFORCEMENT STUDIES

- The typical "sucrose" paradigm:
  CS turns up intermittently, reinf delivered (only contingency is need to collect it).
  So Pavlovian/SD.

- Pat's method: establish stimulus as SD for cocaine IVSA.
  ?Acquisition of new response ?enhancement of behaviour in a second-order schedule by the stimulus

- Didn't work well: nosepoke -> {cocaine + stimulus}.
  Little conditioned reinforcement effect (Zimmerman wouldn't have been surprised, I suppose).
  ALM also noticed somewhat greater responding when the acq.-new-response (ANR) phase was VR1-3 than RR2,
  despite (on average) few lever-presses per CS in the RR condition, and equivalent prior training.
  But not great discrimination in either case.
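
  For reference, the two ANR schedules compared above differ in how the response requirement is generated.
  A minimal sketch in Python, assuming VR1-3 means a requirement drawn uniformly from 1, 2 or 3 presses and
  RR2 means each press is reinforced independently with probability 1/2 (both are assumptions about the
  conventions used here; the function names are hypothetical and this is not the task's own code):

```python
import random

def vr_requirement(rng, low=1, high=3):
    """Presses required for the next CS under VR(low-high): uniform integer draw."""
    return rng.randint(low, high)

def rr_requirement(rng, mean_ratio=2):
    """Presses required for the next CS under RR(mean_ratio).

    Each press is reinforced independently with probability 1/mean_ratio,
    so the requirement is geometrically distributed with no upper bound.
    """
    p = 1.0 / mean_ratio
    presses = 1
    while rng.random() >= p:
        presses += 1
    return presses

if __name__ == "__main__":
    rng = random.Random(0)
    vr = [vr_requirement(rng) for _ in range(10000)]
    rr = [rr_requirement(rng) for _ in range(10000)]
    print("VR1-3: mean %.2f, max %d" % (sum(vr) / len(vr), max(vr)))
    print("RR2:   mean %.2f, max %d" % (sum(rr) / len(rr), max(rr)))
```

  Under these assumptions both schedules have a programmed mean of 2 presses per CS, but VR1-3 never asks
  for more than 3 presses, whereas RR2 occasionally produces long unreinforced runs.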

SUGGESTED COCAINE METHOD (16 Feb 2005; meeting BJE/JL/ALM/RNC)

- Stimulus is SD for cocaine IVSA.
- Perhaps best with a within-session training method? That way, unpredictable even within a session when
  cocaine will be available: only information is the stimulus. Makes stimulus maximally informative.
  So if we follow Zimmerman's technique:

  ? long training stimulus

  EARLY SESSIONS:		periods when stimulus on: nosepoke -> FR1 -> cocaine
						periods when stimulus off: nosepoke -> nothing

						practical note for cocaine: must make stimulus periods, or more particularly the
						no-stimulus periods, long enough that cocaine washes out fully and then some.
						(BJE says washout ~5min, with FR1 cocaine IVSA about once every ~4 min. So
						perhaps make no-stimulus periods ~20 min or a little more; so perhaps have
						stimulus periods this length or a bit shorter.)

  LATE SESSIONS:		periods when stimulus on: nosepoke -> INTERMITTENT schedule -> cocaine
						periods when stimulus off: nosepoke -> nothing.

  ? short training stimulus

  PERHAPS EVEN BETTER:  stimulus is very brief. How about this:

		EARLY SESSIONS:	stimulus turns up every so often, doesn't hang around long,
						if the animal nosepokes when it's on then it gets cocaine.
						(Stimulus stays on until animal nosepokes, to facilitate training?)

		LATE SESSIONS:	stimulus turns up every so often, is similarly brief,
						nosepokes MAY be reinforced (have worked up to a high-ratio schedule)

  So work up to making the stimulus an SD that signals that cocaine *may* be available for a nosepoke.

  ANR PHASE:			Ideal: lever 1 -> CRf (brief - either an abbreviated version of the training stimulus,
                                               or the whole training stimulus, if it were trained in the
											   "brief" method)
						       lever 2 -> another stimulus (defends against "sensory reinforcement" idea)
							              (best to habituate that stimulus a bit)
										  (if using this method, then can have CRf over active lever)

						Less ideal: lever 1 -> CRf, lever 2 -> nothing.
						Relies on past work to prove that stimulus-drug association is critical.
						Means CRf shouldn't be over active lever to defend against accusation of autoshaped responding.

		SCHEDULE:		start FR1 for a bit, work up, like Zimmerman did?


  To get this working, the program needs:
	(1) the facility for a nosepoke to turn the SD off (after X seconds);
	(2) the option to have many/none/many SDs, alternating in phases (a meta-schedule of SD presentation);
	(3) other schedules for the ANR phase.

- ALM IDEAS (18 Feb 2005)

  *Session 1* Rat sits in box with no DS on. He nosepokes, gets nothing. DS comes on (quite long to start with,
  but we'll reduce the length of the DS as they have more training sessions, until ideally the rats only have 5
  seconds to nosepoke once it appears). On nosepoking, the DS stays on for 20 seconds, as the cocaine is infused.
  The rat then has a time out period*.

    *We can't decide whether it would be best to have the timeout so that the time until the next presentation of
	the DS depends upon the time since the last nosepoke - e.g. VI 120 seconds, or whether to have an absolute time
	out period from the offset of the DS (somewhere between 2 and 5 minutes). With the VI version the rat is punished
	for premature nosepokes (so I figure that learning should hopefully be faster) but it does have the potential to
	prolong the session if the rat really doesn't understand that he is being punished. Which do you think would
	work best?
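
    The two timeout rules can be compared with a small sketch in Python. It assumes the interval has
    already been drawn (e.g. one value from a VI 120 s distribution) and that, in the resetting version,
    any nosepoke before the interval has elapsed restarts it. The function names are hypothetical and
    this is not the task's own code.

```python
def next_ds_onset_fixed(ds_offset_s, timeout_s):
    """Absolute timeout: the next DS comes timeout_s after DS offset, whatever the rat does."""
    return ds_offset_s + timeout_s

def next_ds_onset_resetting(ds_offset_s, timeout_s, nosepoke_times_s):
    """Timeout measured from the last nosepoke: a premature nosepoke restarts the clock."""
    clock_start = ds_offset_s
    for t in sorted(nosepoke_times_s):
        if t < clock_start:
            continue                  # nosepokes before DS offset are ignored
        if t < clock_start + timeout_s:
            clock_start = t           # premature nosepoke: restart the interval
        else:
            break                     # DS has already come on before this nosepoke
    return clock_start + timeout_s

if __name__ == "__main__":
    pokes = [50, 100, 300]            # seconds after DS offset (made-up data)
    print(next_ds_onset_fixed(0, 120))
    print(next_ds_onset_resetting(0, 120, pokes))
```

    With the made-up nosepokes above, the fixed rule brings the DS back at 120 s and the resetting rule
    at 220 s, which is the session-lengthening risk mentioned above.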

  After a few opportunities to get cocaine (probably 3) the rat then goes through a set length time out period,
  somewhere between 8 and 15 minutes. (Any advice?) Once he has gone through that, the DS is presented and the cycle
  starts over again. We'd ideally like between 20 and 30 infusions of cocaine per session, but without the session
  getting too long.
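
  A rough estimate of session length under this plan, as a sketch in Python. The defaults are assumptions
  picked from the ranges above (5 s to respond to the DS, 20 s infusion, 3 min short timeout after every
  infusion, 10 min long timeout between blocks of 3 infusions but not after the last block); the function
  name is hypothetical.

```python
import math

def estimate_session_min(infusions, per_block=3, ds_latency_s=5.0, infusion_s=20.0,
                         short_timeout_s=180.0, long_timeout_min=10.0):
    """Estimated session length in minutes for a given number of infusions."""
    blocks = math.ceil(infusions / per_block)
    # Time spent on each infusion: respond to DS, infusion, then the short timeout.
    per_infusion_s = ds_latency_s + infusion_s + short_timeout_s
    # One long timeout between consecutive blocks (none after the final block).
    long_timeouts_s = max(blocks - 1, 0) * long_timeout_min * 60.0
    return (infusions * per_infusion_s + long_timeouts_s) / 60.0

if __name__ == "__main__":
    for n in (20, 25, 30):
        print(n, "infusions:", round(estimate_session_min(n)), "min")
```

  With these defaults, 21 infusions come to about 132 min, so the long timeout dominates and has to shrink
  before sessions get near an hour.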

  *As they start to show good discrimination*
  We reduce the 15 minute(?) time out period until we ultimately remove it. 
  *Once they are discriminating well* Instead of giving cocaine on a FR1 schedule, we'll eventually put them onto
  VR1-3. (Are there any transitions we need to go through, e.g. VR2?) Ideally they would be good enough by this point
  that their training sessions are only about an hour long.

  After that, we put them into a reactivation session (essentially same as training but 15 minutes long and saline
  substituted for cocaine). The ANR would be pretty much as before, except that I will swap the active and inactive
  levers for my current rats (I'll be changing the side of the light DS from what they had before, before they go
  into training). The light will still be above the active lever, and I'll use the VR1-3 schedule on the active
  lever. My measures would be lever pressing (as usual) in the ANR phase, and nosepokes during / outside DS
  presentation in the training phase.

- BJE

  ... I think a key element of this - apart from getting a strong Crft effect of course - is not having too long in
  training.  This is not least because Joff's current expts indicate that with prolonged training, a treatment that
  previously blocked reconsolidation no longer does so.  I'm not at all sure about taking rats up to VR10 in the ANR
  phase, but who knows what they will do?  I am still a bit puzzled about why this procedure has gone off the boil;
  my guess is there is something in the original and present method that is not quite the same and it always
  fascinates me how this creeps into what are supposed to be established procedures.  But the training below
  looks good to me - save the requirement that we try to get rats 'trained' in not much more time than now and
  in sessions that are not too long.

- RNC 21 Feb 2005

  Sounds basically fine to me. I would have thought that the answer to your first question - should the SD be presented
  only when a certain time has elapsed since the last nosepoke (a DRL schedule) - should be no; although that clearly
  does punish inappropriate nosepoking, it'll probably make it quite hard to train (that'd be my guess, anyway).

  I think to get this working I need to implement
	(1) the facility for a nosepoke to turn the SD off (after X seconds);
	(2) the option to have many/none/many SDs, alternating in phases (a meta-schedule of SD presentation);
	    - present SD on RT schedule for Y min, then none for Z min, then back to Y...
	(3) other schedules for the ANR phase.
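
  As a sketch of item (2), the meta-schedule can be thought of as a timeline of alternating phases, starting
  with an SD phase of Y min. This Python sketch only lays out the phases; the RT presentation of SDs within
  an "on" phase is not modelled, and the function and parameter names are hypothetical rather than the
  program's own.

```python
def sd_phase_timeline(session_min, sd_on_min, sd_off_min):
    """Return a list of (start_min, end_min, sd_enabled) phases covering the session."""
    if sd_on_min <= 0 or sd_off_min < 0:
        raise ValueError("sd_on_min must be > 0 and sd_off_min must be >= 0")
    phases = []
    start = 0.0
    sd_enabled = True                 # assume the session opens with an SD phase
    while start < session_min:
        length = sd_on_min if sd_enabled else sd_off_min
        end = min(start + length, session_min)   # truncate the final phase
        if end > start:
            phases.append((start, end, sd_enabled))
        start = end
        sd_enabled = not sd_enabled
    return phases

if __name__ == "__main__":
    for phase in sd_phase_timeline(session_min=50, sd_on_min=20, sd_off_min=10):
        print(phase)
```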


0.4 (5 May 2005)
========================

- for Florence: option to have initial phase of session with houselight on, no levers, nosepokes recorded
  but with no consequence, for defined amount of time (default 10 min). Call it the "prequel" phase.
  m_bGivePrequel and m_fPrequelLength_min.

1.0 (7 March 2007)
========================

- improved ease of user compilation

2.0 (12 Jan 2009)
========================
- Server default changed from "loopback" to "localhost" (Windows Vista compatibility and more general standardization).

2.1 (14 Apr 2015)
========================
- Rebuild to use WhiskerClientLib 4.62 with new socket code.
- Define WINVER as 0x500.
- Compile cleanly with full warnings.


THINGS TO DO
========================