*=====================================================================.
*LAB COMPUTER Syntax for Lab 4 of Quantitative Methods 1.
*24 Oct 05, v7.
*=====================================================================.
********************************************* NOTE *******************************************************.
*** The macro programs needed for the commands used in this lab are pasted at the end of this file ****.
********************************************* NOTE *******************************************************.
*=====================================================================.
*4.2.1 Exercise 4.4.1 Pooled Variance CI for the difference between 2 Means (Pryce, p.4-16).
*=====================================================================.
*Suppose the mean height of the girls in your sample of ten equals 100 cm .
*(s.d. = 30 cm), and the mean height of 12 boys is 94 cm (s.d. = 31 cm). Calculate .
*the 95% confidence interval for the difference in population means, assuming .
*homogeneous variances.
*.
CI_S2Mp n1=(10) n2=(12) x_bar1=(100) x_bar2=(94) s1=(30) s2=(31) c=(.95).
*CI for the difference between 2 population means (pooled variance).
*   SAMPDIFF         SP        TiL         SE        err      Lower      Upper.
*    6.00000   30.55405   -2.08596   13.08246   27.28954  -21.28954   33.28954.
*The CI for the difference spans zero, so we cannot say with .
*any confidence that a difference exists in the population.
*=====================================================================.
*4.2.2 Exercise 4.4.2 Heterog. Variance CI for the difference between 2 Means (Pryce, p.4-17).
*=====================================================================.
*.
*Run the heterogeneous variance CI method on the child height example above.
CI_S2Md n1=(10) n2=(12) x_bar1=(100) x_bar2=(94) s1=(30) s2=(31) c=(.95).
*CI for the difference between 2 population means (different variances).
*   SAMPDIFF        TiL         SE        err      Lower      Upper.
*    6.00000   -2.26216   13.04160   29.50215  -23.50215   35.50215.
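*Hand check of the two standard errors above. This is a minimal sketch and is .
*not one of the lab macros. The names sp, se_pool, se_diff and CHECK are .
*illustrative only. The t value is not recomputed here, because the macros .
*obtain it from IDF.T outside MATRIX.
MATRIX.
COMPUTE n1 = 10.
COMPUTE n2 = 12.
COMPUTE s1 = 30.
COMPUTE s2 = 31.
COMPUTE sp = SQRT(((n1 - 1)*s1**2 + (n2 - 1)*s2**2) / (n1 + n2 - 2)).
COMPUTE se_pool = sp * SQRT((1/n1) + (1/n2)).
COMPUTE se_diff = SQRT((s1**2/n1) + (s2**2/n2)).
COMPUTE CHECK = {sp, se_pool, se_diff}.
PRINT CHECK / FORMAT "F10.5" /Title = "Hand check of SP and SE for the child height example" / CLABELS = sp, se_pool, se_diff.
END MATRIX.
*sp and se_pool should agree with the SP and SE columns printed by CI_S2Mp, .
*and se_diff with the SE column printed by CI_S2Md.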
*=====================================================================.
*4.2.3 Exercise 4.6 Calculating Confidence Intervals for Proportions (Pryce, Chapter 4):.
*=====================================================================.
*Consider the following email from a Consultant Paediatrician and the .
*associated PowerPoint slide (based on a real-life statistical problem).  The .
*data refer to the number of deaths of children with Leukaemia (numerator) over .
*the total number in the sample (denominator).  There are four samples: two .
*in the "early" period (samples taken of children with Leukaemia in 1995-98) .
*and two in the "late" period  (samples taken of children with Leukaemia in .
*1998-2002).  The first sample in each period is of children given 5 courses .
*of chemotherapy. The second sample in each period is of children given 4 .
*courses of chemotherapy.  You would think that 5 courses of chemotherapy .
*would result in fewer deaths than 4 courses, but the picture seems less .
*straightforward and so rather puzzling.  To solve Dr X's problem, answer the .
*following questions:.
*                      Deaths/Patients .
*Period of Study     5 courses   4 courses .
*Early                  41/240      66/240 .
*Late                  153/375     106/379 .
*Subtotal:             194/615     172/619 .
*=====================================================================.
*(i) Derive a confidence interval for the proportion of deaths of .
*children who underwent 5 courses in the Early period.
*=====================================================================.
CI_L1P n=(240) x=(41) c=(.95).
*Traditional Large sample CI for one proportion.
*      ptrad      zstar    se_trad      etrad   low_trad    up_trad.
*    .170833  -1.959964    .024294    .047616    .123218    .218449.
*Wilson Large sample CI for one proportion.
*      pwlsn      zstar    se_wlsn      ewlsn   low_wlsn    up_wlsn.
*    .176230  -1.959964    .024392    .047807    .128422    .224037.
*=====================================================================.
*(ii) Derive a confidence interval for the proportion of deaths of .
*children who underwent 4 courses in the Early period and compare with (i).
*=====================================================================.
CI_L1P n=(240) x=(66) c=(.95).
*Traditional Large sample CI for one proportion.
*      ptrad      zstar    se_trad      etrad   low_trad    up_trad.
*    .275000  -1.959964    .028822    .056491    .218509    .331491.
*Wilson Large sample CI for one proportion.
*      pwlsn      zstar    se_wlsn      ewlsn   low_wlsn    up_wlsn.
*    .278689  -1.959964    .028703    .056257    .222432    .334945.
*Note that the 95% confidence intervals for the two sets of courses in the early period overlap:.
*   Five courses in the early period (sample proportion = 17.08%) 95% CI = (12.8%, 22.4%).
*   Four courses in the early period (sample proportion = 27.50%) 95% CI = (22.2%, 33.5%).
*So, the difference between the two sample proportions may simply be due to sampling variability.
*=====================================================================.
*(iii) Derive a confidence interval for the proportion of deaths of .
*children who underwent 5 courses in the Late period.
*=====================================================================.
CI_L1P n=(375) x=(153) c=(.95).
*Traditional Large sample CI for one proportion.
*      ptrad      zstar    se_trad      etrad   low_trad    up_trad.
*    .408000  -1.959964    .025379    .049742    .358258    .457742.
*Wilson Large sample CI for one proportion.
*      pwlsn      zstar    se_wlsn      ewlsn   low_wlsn    up_wlsn.
*    .408971  -1.959964    .025254    .049497    .359474    .458468.
*=====================================================================.
*(iv) Derive a confidence interval for the proportion of deaths of .
*children who underwent 4 courses in the Late period and compare with (iii).
*=====================================================================.
CI_L1P n=(379) x=(106) c=(.95).
*Traditional Large sample CI for one proportion.
*      ptrad      zstar    se_trad      etrad   low_trad    up_trad.
*    .279683  -1.959964    .023056    .045188    .234495    .324871.
*Wilson Large sample CI for one proportion.
*      pwlsn      zstar    se_wlsn      ewlsn   low_wlsn    up_wlsn.
*    .281984  -1.959964    .022992    .045064    .236920    .327048.
*The 95% confidence intervals for the two sets of courses in the late period do NOT overlap .
*which means that the difference cannot be ascribed to sampling variation alone:.
*   Five courses in the late period (sample proportion = 40.80%) 95% CI = (35.9%, 45.9%).
*   Four courses in the late period (sample proportion = 27.97%) 95% CI = (23.7%, 32.7%).
*=====================================================================.
*(v) Derive a confidence interval for the proportion of deaths of .
*children who underwent 5 courses in the combined periods.
*=====================================================================.
CI_L1P n=(615) x=(194) c=(.95).
*Traditional Large sample CI for one proportion.
*      ptrad      zstar    se_trad      etrad   low_trad    up_trad.
*    .315447  -1.959964    .018738    .036726    .278721    .352173.
*Wilson Large sample CI for one proportion.
*      pwlsn      zstar    se_wlsn      ewlsn   low_wlsn    up_wlsn.
*    .316640  -1.959964    .018697    .036645    .279995    .353284.
*=====================================================================.
*(vi) Derive a confidence interval for the proportion of deaths of .
*children who underwent 4 courses in the combined periods and compare with (v).
*=====================================================================.
CI_L1P n=(619) x=(172) c=(.95).
*Traditional Large sample CI for one proportion.
*      ptrad      zstar    se_trad      etrad   low_trad    up_trad.
*    .277868  -1.959964    .018005    .035288    .242579    .313156.
*Wilson Large sample CI for one proportion.
*      pwlsn      zstar    se_wlsn      ewlsn   low_wlsn    up_wlsn.
*    .279294  -1.959964    .017975    .035230    .244064    .314524.
*Again, note that the 95% confidence intervals for the two sets of courses in the combined periods also overlap:.
*   Five courses in the combined periods (sample proportion = 31.55%) 95% CI = (28.0%, 35.3%).
*   Four courses in the combined periods (sample proportion = 27.79%) 95% CI = (24.4%, 31.5%).
*Dr X would need much larger samples to derive the narrower confidence intervals .
*needed to identify the true differences in the proportions between different treatments.
*Summary:.
*Where two intervals overlap, there is no significant difference between the proportions .
*(that is, we cannot say with any degree of confidence that the difference isn’t simply due to random sampling variability).
*What is odd here is that there appears to be a significant difference between the early and late 5 course proportions: .
*the confidence intervals are far apart and so it is highly unlikely that the difference between the death rates is due to sampling variability only.
*=====================================================================.
*(b) Additional Exercise for those with a medical interest: .
*Differences Among Outcome Measures in Occupational Low Back Pain.
*=====================================================================.
*Ferguson et al (2005) note that, "Low back pain recurrence rates have been reported as high as 70%; .
*however, these rates vary greatly depending on the definition of recurrence (1–6).
*The high rate of recurrent low back pain as well as variability suggests that .
*we do not have good understanding of low back pain recovery.
*Examining the various outcome measures that have been used in the past .
*and developing our understanding of the relationship among them .
*may provide insight as to why recurrence rates are so high." .
*They construct a cross-sectional survey of 208 workers who have returned to work .
*after a work-related episode of low back pain, and compare different outcome measures .
*of recurring lower back pain after returning to work.
*They find apparent differences in the percentage of subjects recovered for different outcome measures, .
*as summarised in the table below .
*(x refers to the number of subjects who have recovered according to each criterion, .
*and p gives x as a proportion of the sample size, n).
*    Criteria used to define “recovery”      x       p.
*(a) full duty return to work              206    0.99.
*(b) activities of daily living             52    0.25.
*(c) symptoms                               35    0.17.
*(d) functional performance probability     26   0.125.
*(e) motion                                123    0.59.
*(f) velocity                               27    0.13.
*(g) acceleration                           21    0.10.
*    n                                     208 .
*Source: Ferguson, S.A. et al (2005) Differences Among Outcome Measures in Occupational Low Back Pain, .
*Journal of Occupational Rehabilitation, 15(3) 329 - 341.
*(i) Compare proportions (a) and (e) at the 95% confidence level.
*(ii) Compare proportions (b) and (c) at the 95% confidence level.
*(iii) Compare proportions (c) and (d) at the 95% confidence level.
*(iv) Compare proportions (f) and (g) at the 95% confidence level.
*Comment on your results in each case.
*=====================================================================.
*(i) Compare proportions (a) and (e) at the 95% confidence level.
*=====================================================================.
*(a).
CI_L1P n=(208) x=(206) c=(.95).
*Traditional Large sample CI for one proportion.
*      ptrad      zstar    se_trad      etrad   low_trad    up_trad.
*    .990385  -1.959964    .006766    .013262    .977123   1.003646.
*Wilson Large sample CI for one proportion.
*      pwlsn      zstar    se_wlsn      ewlsn   low_wlsn    up_wlsn.
*    .981132  -1.959964    .009345    .018315    .962817    .999447.
*Notice that the upper bound of the traditional CI estimate exceeds 1, which is meaningless.
*(e).
CI_L1P n=(208) x=(123) c=(.95).
*Traditional Large sample CI for one proportion.
*      ptrad      zstar    se_trad      etrad   low_trad    up_trad.
*    .591346  -1.959964    .034085    .066806    .524540    .658152.
*Wilson Large sample CI for one proportion.
*      pwlsn      zstar    se_wlsn      ewlsn   low_wlsn    up_wlsn.
*    .589623  -1.959964    .033784    .066215    .523407    .655838.
*The confidence intervals do not overlap so there is a significant difference between the (a) and (e) proportions.
*=====================================================================.
*(ii) Compare proportions (b) and (c) at the 95% confidence level.
*=====================================================================.
*(b).
CI_L1P n=(208) x=(52) c=(.95).
*Traditional Large sample CI for one proportion.
*      ptrad      zstar    se_trad      etrad   low_trad    up_trad.
*    .250000  -1.959964    .030024    .058846    .191154    .308846.
*Wilson Large sample CI for one proportion.
*      pwlsn      zstar    se_wlsn      ewlsn   low_wlsn    up_wlsn.
*    .254717  -1.959964    .029924    .058650    .196067    .313367.
*(c).
CI_L1P n=(208) x=(35) c=(.95).
*Traditional Large sample CI for one proportion.
*      ptrad      zstar    se_trad      etrad   low_trad    up_trad.
*    .168269  -1.959964    .025940    .050841    .117429    .219110.
*Wilson Large sample CI for one proportion.
*      pwlsn      zstar    se_wlsn      ewlsn   low_wlsn    up_wlsn.
*    .174528  -1.959964    .026069    .051093    .123435    .225622.
*The confidence intervals for (b) and (c) overlap so there is no .
*significant difference between these proportions.
*=====================================================================.
*(iii) Compare proportions (c) and (d) at the 95% confidence level.
*=====================================================================.
*(c).
CI_L1P n=(208) x=(35) c=(.95).
*see results above.
*(d).
CI_L1P n=(208) x=(26) c=(.95).
*Traditional Large sample CI for one proportion.
*      ptrad      zstar    se_trad      etrad   low_trad    up_trad.
*    .125000  -1.959964    .022931    .044944    .080056    .169944.
*Wilson Large sample CI for one proportion.
*      pwlsn      zstar    se_wlsn      ewlsn   low_wlsn    up_wlsn.
*    .132075  -1.959964    .023253    .045576    .086500    .177651.
*The confidence intervals for (c) and (d) also overlap so there is no .
*significant difference between these proportions.
*=====================================================================.
*(iv) Compare proportions (f) and (g) at the 95% confidence level.
*=====================================================================.
*(f).
CI_L1P n=(208) x=(27) c=(.95).
*Traditional Large sample CI for one proportion.
*      ptrad      zstar    se_trad      etrad   low_trad    up_trad.
*    .129808  -1.959964    .023304    .045675    .084133    .175482.
*Wilson Large sample CI for one proportion.
*      pwlsn      zstar    se_wlsn      ewlsn   low_wlsn    up_wlsn.
*    .136792  -1.959964    .023600    .046256    .090536    .183049.
*(g).
CI_L1P n=(208) x=(21) c=(.95).
*Traditional Large sample CI for one proportion.
*      ptrad      zstar    se_trad      etrad   low_trad    up_trad.
*    .100962  -1.959964    .020890    .040943    .060018    .141905.
*Wilson Large sample CI for one proportion.
*      pwlsn      zstar    se_wlsn      ewlsn   low_wlsn    up_wlsn.
*    .108491  -1.959964    .021359    .041864    .066627    .150354.
*The confidence intervals for (f) and (g) almost entirely overlap so there is no .
*significant difference between these proportions. The intervals would still overlap .
*even if we accepted a much lower level of confidence.
*=====================================================================.
*4.2.4 Example 4.7 Using SPSS to calculate CIs when you have the original data (Pryce, p.4-23).
*=====================================================================.
*.
*1. Open up the data on house prices and run the following EXAMINE .
*syntax on purchase price (note the /PLOT NONE  /CINTERVAL 95  qualifiers .
*which suppress the graph output and specify the confidence level .
*respectively). Now check how this interval compares to that obtained using .
*our macro method.
GET FILE='Q:\QUANTS\Glasgow_houseprices_pop_2004q3q4.sav'.
EXAMINE VARIABLES = sellingprice /PLOT NONE /CINTERVAL 95 .
CI_S1M n = (3731) x_bar = (109229.5) s = (63557.35) c = (0.95).
*Small sample confidence interval for the population mean.
*           n       x_bar         TiL          SE         err       Lower       Upper.
*  3731.00000  109229.500    -1.96060  1040.52653  2040.05650  107189.443  111269.557.
*2. You have a sample of 83 observations on income as listed in .
*the income.sav data file (which you can obtain from the Downloads page of .
*www.geebeejey.co.uk or from the Q: drive of the Faculty labs).  Compute (a) .
*a 90% confidence interval for income; and (b) a 90% confidence interval for .
*the proportion of households with incomes of £80,000 and over.
*(a).
GET FILE='Q:\QUANTS\income.sav'.
EXAMINE VARIABLES = income /PLOT NONE /CINTERVAL 90 .
CI_S1M n = (83) x_bar = (35887.22) s = (17457.15) c = (0.90).
*Small sample confidence interval for the population mean.
*           n       x_bar         TiL          SE         err       Lower       Upper.
*    83.00000  35887.2200    -1.66365  1916.17115  3187.83657  32699.3834  39075.0566.
*(b).
GET FILE='Q:\QUANTS\income.sav'.
RECODE income (Lowest thru 79999.9999=0) (80000 thru Highest=1) (ELSE=SYSMIS) INTO incover80K.
VARIABLE LABELS incover80K 'Income over 80K'.
EXECUTE .
EXAMINE VARIABLES = incover80K /PLOT NONE /CINTERVAL 90 .
*There is clearly a problem here. The lower bound on the confidence interval is negative! .
*To say that we are 95% certain that the proportion of households in the population .
*with an income over £80,000 lies between –1.19% and 3.6% is meaningless .
*since we know that a proportion cannot be negative.
*(These bounds correspond to /CINTERVAL 95; at the 90% level requested above the interval .
*is narrower, but its lower bound is still negative.).
*This spurious result arises because SPSS does not know that you are dealing with a proportion .
*rather than a mean and therefore does not apply the correct formula for calculating the confidence interval.
*A more meaningful result is achieved if we run frequencies on our dummy variable .
*to find the number of households in our sample with incomes of at least £80,000, .
FREQUENCIES VARIABLES=incover80K .
*Income over 80K.
*               Frequency   Percent   Valid Percent   Cumulative Percent.
*Valid   .00           82      98.8            98.8                 98.8.
*        1.00           1       1.2             1.2                100.0.
*        Total         83     100.0           100.0 .
*and then use the appropriate macro syntax to compute a 90% confidence interval for a proportion:.
CI_L1P n=(83) x=(1) c=(.90).
*Traditional Large sample CI for one proportion.
*      ptrad      zstar    se_trad      etrad   low_trad    up_trad.
*    .012048  -1.644854    .011975    .019698   -.007650    .031746.
*Wilson Large sample CI for one proportion.
*      pwlsn      zstar    se_wlsn      ewlsn   low_wlsn    up_wlsn.
*    .034483  -1.644854    .019562    .032177    .002306    .066660.
*=====================================================================.
*4.2.5 Exercise 4.7 Using the GRAPH/ERRORBAR command to compare CIs (Pryce, p.4-25).
*=====================================================================.
*=====================================================================.
*1. Auctions in Durham and Cumberland.
*=====================================================================.
*Open the auction.sav data set.  The file records the estimated .
*and final sale prices of 100 items entered for auction at venues in .
*Cumberland and Durham. Use the EXAMINE VARIABLES command to compare the 95% .
*confidence intervals of the auctioneer's estimated value (value) and the .
*actual purchase price (purchase).  Now compare the two variables using the .
*GRAPH /ERRORBAR  command.  Does the graph confirm what you found from using .
*the 'Explore' function?  What happens if you change the confidence level to .
*80%?  Explain your findings.
*.
*=====================================================================.
*2. Auctions in Durham and Cumberland: 95% CI.
*=====================================================================.
*Compare the 95% confidence intervals for the purchase .
*price of lots entered in Durham vs those entered in Cumberland.
*What do your results tell you?.
GET FILE='Q:\QUANTS\auction.sav'.
EXAMINE VARIABLES = purchase value /PLOT NONE /CINTERVAL 95 .
GRAPH /ERRORBAR( CI 95 )=purchase value .
EXAMINE VARIABLES = purchase value /PLOT NONE /CINTERVAL 80 .
GRAPH /ERRORBAR( CI 80 )=purchase value .
GRAPH /ERRORBAR( CI 95 )=purchase BY zcounty /MISSING=REPORT.
GRAPH /ERRORBAR( CI 80 )=purchase BY zcounty /MISSING=REPORT.
EXAMINE VARIABLES=purchase BY zcounty /PLOT NONE /CINTERVAL 80 .
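*A possible addition, not in the original lab syntax: the same EXAMINE by county .
*at the 95% level, so that its table can be set beside the 95% error bar chart by .
*county. It reuses the purchase and zcounty variables from the commands above.
EXAMINE VARIABLES=purchase BY zcounty /PLOT NONE /CINTERVAL 95 .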
*Note that you can compare variables and groups all on the same graph. For example,.
GRAPH /ERRORBAR( CI 95 )=purchase value BY zcounty
  /TITLE= 'Comparison of Confidence Intervals for the Mean' 'Purchase price and valuation'
  /SUBTITLE= 'Cumberland and Durham compared'
  /FOOTNOTE= 'Source: Acme Auction Data'.
*Thus, we have managed to display on one graph what the two previous graphs showed separately: .
*although the confidence intervals still overlap for the two areas, .
*they do so to a lesser extent for the comparison between areas .
*than for the comparison between purchase and value.
*=====================================================================.
*4.2.6 Exercise 4.8.2 Sample Size Determination.
*=====================================================================.
*For your PhD, you want to estimate the mean hourly wage rate of unskilled .
*labour in Easterhouse within ±£0.10 at the 99% confidence level.  A 1987 .
*study (large sample size) by the Department of Employment resulted in a .
*standard deviation of £0.85.  Using this as an approximation for s, compute .
*the necessary sample size to arrive at the desired level of accuracy.
N_L1M e=(0.1) c=(0.99) s=(0.85) .
*n_hat = estimated sample size needed to achieve an error of size e given c.
*          e          c        ziL        ziU      n_hat.
*     .10000     .99000   -2.57583    2.57583  479.37128.
*Rounding n_hat up to the next whole number, a sample size of 480 is needed to achieve the desired level of accuracy.
*=====================================================================.
*End of exercises.
*=====================================================================.
*#############################################################################.
*#############################################################################.
*#############################################################################.
*#############################################################################.
*#############################################################################.
*#############################################################################.
*=====================================================================.
*Macro Programs.
*=====================================================================.
*If these macros have not already been installed on the lab machines, simply highlight all the programs below.
*Then run them as one command by pressing CTRL+R.
*You will then be able to use macro commands .
*---- Highlight from the start of this line... -------------------------------------------.
DEFINE pz_lt_zi (!POSITIONAL !ENCLOSE('(',')')).
GET FILE='Q:\QUANTS\one.sav'.
compute Zi_Var = !1 .
COMPUTE PROB = CDFNORM(Zi_Var).
execute.
MATRIX.
GET PROB_VAR /VARIABLES = PROB.
GET Zi_Var /VARIABLES = Zi_Var.
COMPUTE Prob = PROB_VAR(1).
COMPUTE zi = Zi_Var(1).
COMPUTE ANSWER = {zi, PROB}.
PRINT ANSWER / FORMAT "F10.5" /Title = " Prob(z < zi) for a given zi " / CLABELS = zi, Prob.
END MATRIX.
!ENDDEFINE.
DEFINE pz_gt_zi (!POSITIONAL !ENCLOSE('(',')')).
GET FILE='Q:\QUANTS\one.sav'.
compute Zi_Var = !1 .
COMPUTE PROB = 1 - CDFNORM(Zi_Var).
execute.
MATRIX.
GET PROB_VAR /VARIABLES = PROB.
GET Zi_Var /VARIABLES = Zi_Var.
COMPUTE Prob = PROB_VAR(1).
COMPUTE zi = Zi_Var(1).
COMPUTE ANSWER = {zi, PROB}.
PRINT ANSWER / FORMAT "F10.5" /Title = " Prob(z > zi) for a given zi " / CLABELS = zi, Prob.
END MATRIX.
!ENDDEFINE.
DEFINE pz_lg_zi (zil = !ENCLOSE('(',')') / ziu = !ENCLOSE('(',')')).
GET FILE='Q:\QUANTS\one.sav'.
compute ZiL_Var = !zil .
compute ZiU_Var = !ziu .
execute.
COMPUTE PROBL = CDFNORM(ZiL_Var).
COMPUTE PROBU = 1 - CDFNORM(ZiU_Var).
COMPUTE PROBLG = PROBL + PROBU.
execute.
MATRIX.
GET PROB_VAR /VARIABLES = PROBLG.
GET ZiL_Var /VARIABLES = ZiL_Var.
GET ZiU_Var /VARIABLES = ZiU_Var.
COMPUTE Prob = PROB_VAR(1).
COMPUTE ziL = ZiL_Var(1).
COMPUTE ziU = ZiU_Var(1).
COMPUTE ANSWER = {ziL, ziU, PROB}.
PRINT ANSWER / FORMAT "F10.5" /Title = " Prob((z < ziL) OR (z > ziU)) for a given zi " / CLABELS = ziL, ziU, Prob.
END MATRIX.
!ENDDEFINE.
DEFINE pz_gl_zi (zil = !ENCLOSE ('(',')') / ziu = !ENCLOSE('(',')')).
GET FILE='Q:\QUANTS\one.sav'.
compute ZiL_Var = !zil .
compute ZiU_Var = !ziu .
execute.
COMPUTE PROBL = CDFNORM(ZiL_Var).
COMPUTE PROBU = 1 - CDFNORM(ZiU_Var).
COMPUTE PROBLG = 1 - (PROBL + PROBU).
execute.
MATRIX.
GET PROB_VAR /VARIABLES = PROBLG.
GET ZiL_Var /VARIABLES = ZiL_Var.
GET ZiU_Var /VARIABLES = ZiU_Var.
COMPUTE Prob = PROB_VAR(1).
COMPUTE ziL = ZiL_Var(1).
COMPUTE ziU = ZiU_Var(1).
COMPUTE ANSWER = {ziL, ziU, PROB}.
PRINT ANSWER / FORMAT "F10.5" /Title = " Prob(ziL < z < ziU) for a given zi " / CLABELS = ziL, ziU, Prob.
END MATRIX.
!ENDDEFINE.
DEFINE zi_lt_zp (p = !ENCLOSE('(',')')).
GET FILE='Q:\QUANTS\one.sav'.
COMPUTE Zi = PROBIT(!p).
EXECUTE.
MATRIX.
GET Zi_VAR /VARIABLES = Zi.
COMPUTE Zi = Zi_VAR(1).
COMPUTE PROB= {!p}. /*Enter the given probability into the curly brackets*/
COMPUTE ANSWER = {Zi, PROB}.
PRINT ANSWER / FORMAT "F10.5" /Title = " Value of zi such that Prob(z < zi) = PROB when PROB is given" / CLABELS = zi, PROB.
END MATRIX.
!END DEFINE.
DEFINE zi_gt_zp (p = !ENCLOSE('(',')')).
GET FILE='Q:\QUANTS\one.sav'.
COMPUTE Zi = PROBIT(1-!p).
EXECUTE.
MATRIX.
GET Zi_VAR /VARIABLES = Zi.
COMPUTE Zi = Zi_VAR(1).
COMPUTE PROB= {!p}. /*Enter the given probability into the curly brackets*/
COMPUTE ANSWER = {Zi, PROB}.
PRINT ANSWER / FORMAT "F10.5" /Title = " Value of zi such that Prob(z > zi) = PROB when PROB is given" / CLABELS = zi, PROB.
END MATRIX.
!END DEFINE.
DEFINE zi_gl_zp (p = !ENCLOSE('(',')')).
GET FILE='Q:\QUANTS\one.sav'.
COMPUTE PROB = !p.
COMPUTE PROBLG = 1 - !p.
COMPUTE PROBL = PROBLG / 2.
COMPUTE ZiL_Var = PROBIT(PROBL).
COMPUTE ZiU_Var = -1 * ZiL_Var .
execute.
MATRIX.
GET PROB_VAR /VARIABLES = PROB.
GET ZiL_Var /VARIABLES = ZiL_Var.
GET ZiU_Var /VARIABLES = ZiU_Var.
COMPUTE Prob = PROB_VAR(1).
COMPUTE ziL = ZiL_Var(1).
COMPUTE ziU = ZiU_Var(1).
COMPUTE ANSWER = {ziL, ziU, PROB}.
PRINT ANSWER / FORMAT "F10.5" /Title = " Value of zi such that Prob(-zi < z < zi) = PROB, when PROB is given " / CLABELS = ziL, ziU, Prob.
END MATRIX.
!ENDDEFINE.
DEFINE CI_L1M (n = !ENCLOSE('(',')') /x_bar = !ENCLOSE('(',')') /s = !ENCLOSE('(',')') /c = !ENCLOSE('(',')')).
GET FILE='Q:\QUANTS\one.sav'.
COMPUTE PROB = !c.
COMPUTE PROBLG = 1 - !c.
COMPUTE PROBL = PROBLG / 2.
COMPUTE ZiL = PROBIT(PROBL).
COMPUTE ZiU = -1 * ZiL .
execute.
MATRIX.
COMPUTE n = {!n}. /* Enter the sample size here (i.e. change the number in curly brackets)*/
COMPUTE x_bar = {!x_bar}. /* Enter the sample mean here*/
COMPUTE s = {!s}. /* Enter the sample standard deviation here*/
COMPUTE SE = s/SQRT(n).
GET ZiL /VARIABLES = ZiL.
COMPUTE ERR = -ZiL * SE.
COMPUTE LOWER = x_bar - err.
COMPUTE UPPER = x_bar + err.
COMPUTE ANSWER = {n, x_bar, ZiL, SE, err, Lower, Upper}.
PRINT ANSWER / FORMAT "F10.5" /Title = "Large sample confidence interval for the population mean" / CLABELS = n, x_bar, ZiL, SE, err, Lower, Upper.
END MATRIX.
!END DEFINE.
DEFINE CI_S1M (n = !ENCLOSE('(',')') /x_bar = !ENCLOSE('(',')') /s = !ENCLOSE('(',')') /c = !ENCLOSE('(',')')).
GET FILE='Q:\QUANTS\one.sav'.
COMPUTE df = !n - 1.
COMPUTE PROB = !c.
COMPUTE PROBLG = 1 - !c.
COMPUTE PROBL = PROBLG / 2.
COMPUTE TiL = IDF.T(PROBL, df).
COMPUTE TiU = -1 * TiL .
execute.
MATRIX.
COMPUTE n = {!n}. /* Enter the sample size here (i.e. change the number in curly brackets)*/
COMPUTE x_bar = {!x_bar}. /* Enter the sample mean here*/
COMPUTE s = {!s}. /* Enter the sample standard deviation here*/
COMPUTE SE = s/SQRT(n).
GET TiL /VARIABLES = TiL.
GET df /VARIABLES = df.
COMPUTE ERR = -TiL * SE.
COMPUTE LOWER = x_bar - err.
COMPUTE UPPER = x_bar + err.
COMPUTE ANSWER = {n, x_bar, TiL, SE, err, Lower, Upper}.
PRINT ANSWER / FORMAT "F10.5" /Title = "Small sample confidence interval for the population mean" / CLABELS = n, x_bar, TiL, SE, err, Lower, Upper.
END MATRIX.
!END DEFINE.
DEFINE CI_S2Mp (n1 = !ENCLOSE('(',')') /n2 = !ENCLOSE('(',')') /x_bar1 = !ENCLOSE('(',')') /x_bar2 = !ENCLOSE('(',')') /s1 = !ENCLOSE('(',')') /s2 = !ENCLOSE('(',')') /c = !ENCLOSE('(',')')).
GET FILE='Q:\QUANTS\one.sav'.
COMPUTE df = !n1 + !n2 - 2.
COMPUTE PROB = !c.
COMPUTE PROBLG = 1 - !c.
COMPUTE PROBL = PROBLG / 2.
COMPUTE TiL = IDF.T(PROBL, df).
COMPUTE TiU = -1 * TiL .
execute.
MATRIX.
GET df / variables = df. /* Enter the df here (i.e. change the number in curly brackets)*/
COMPUTE x_bar1 = {!x_bar1}. /* Enter the sample mean here*/
COMPUTE x_bar2 = {!x_bar2}. /* Enter the sample mean here*/
COMPUTE sp = SQRT(( (!n1 - 1)* !s1**2 + (!n2 - 1) * !s2**2 ) / (!n1 + !n2 - 2) ).
COMPUTE SE = sp*(SQRT((1/!n1) + (1/!n2))).
GET TiL /VARIABLES = TiL.
GET df /VARIABLES = df.
COMPUTE ERR = -TiL * SE.
COMPUTE SAMPDIFF = x_bar1 - x_bar2.
COMPUTE LOWER = SAMPDIFF - err.
COMPUTE UPPER = SAMPDIFF + err.
COMPUTE ANSWER = {SAMPDIFF, SP, TiL, SE, err, Lower, Upper}.
PRINT ANSWER / FORMAT "F10.5" /Title = "CI for the difference between 2 population means (pooled variance)" / CLABELS = SAMPDIFF, SP, TiL, SE, err, Lower, Upper.
END MATRIX.
!END DEFINE.
DEFINE CI_S2Md (n1 = !ENCLOSE('(',')') /n2 = !ENCLOSE('(',')') /x_bar1 = !ENCLOSE('(',')') /x_bar2 = !ENCLOSE('(',')') /s1 = !ENCLOSE('(',')') /s2 = !ENCLOSE ('(',')') /c = !ENCLOSE('(',')')).
GET FILE='Q:\QUANTS\one.sav'.
COMPUTE df = min((!n1 -1), (!n2 - 1)).
COMPUTE PROB = !c.
COMPUTE PROBLG = 1 - !c.
COMPUTE PROBL = PROBLG / 2.
COMPUTE TiL = IDF.T(PROBL, df).
COMPUTE TiU = -1 * TiL .
execute.
MATRIX.
GET df / variables = df. /* Enter the df here (i.e. change the number in curly brackets)*/
COMPUTE x_bar1 = {!x_bar1}. /* Enter the sample mean here*/
COMPUTE x_bar2 = {!x_bar2}. /* Enter the sample mean here*/
COMPUTE SE = SQRT((!s1**2/!n1) + (!s2**2/!n2)).
GET TiL /VARIABLES = TiL.
GET df /VARIABLES = df.
COMPUTE ERR = -TiL * SE.
COMPUTE SAMPDIFF = x_bar1 - x_bar2.
COMPUTE LOWER = SAMPDIFF - err.
COMPUTE UPPER = SAMPDIFF + err.
COMPUTE ANSWER = {SAMPDIFF, TiL, SE, err, Lower, Upper}.
PRINT ANSWER / FORMAT "F10.5" /Title = "CI for the difference between 2 population means (different variances)" / CLABELS = SAMPDIFF, TiL, SE, err, Lower, Upper.
END MATRIX.
!END DEFINE.
DEFINE CI_L1P (n = !ENCLOSE ('(',')') /x = !ENCLOSE('(',')') /c = !ENCLOSE('(',')')).
GET FILE='Q:\QUANTS\one.sav'.
COMPUTE PROB = !c.
COMPUTE PROBLG = 1 - !c.
COMPUTE PROBL = PROBLG / 2.
COMPUTE ZiL = PROBIT(PROBL).
COMPUTE ZiU = -1 * ZiL .
execute.
MATRIX.
COMPUTE n = !n. /* Enter the sample size here */
COMPUTE x = !x. /* Enter the number of "successes" or particular outcomes here */
COMPUTE CONFID = !c. /* Enter the desired confidence level here */
COMPUTE pTrad = x/n. /* the traditional estimate of the pop. proportion used in CI estimation */
COMPUTE pWlsn = (x+2)/(n + 4). /* the Wilson estimate */
GET zstar /VARIABLES = ZiL.
COMPUTE SE_Trad = SQRT((pTrad*(1-pTrad))/n).
COMPUTE SE_Wlsn = SQRT((pWlsn*(1-pWlsn))/(n+4)).
COMPUTE eTrad = -zstar * SE_Trad.
COMPUTE eWlsn = -zstar * SE_Wlsn.
COMPUTE LOW_Trad = pTrad - eTrad.
COMPUTE LOW_Wlsn = pWlsn - eWlsn.
COMPUTE UP_Trad = pTrad + eTrad.
COMPUTE UP_Wlsn = pWlsn + eWlsn.
COMPUTE ANSWER = {pTrad, zstar, se_trad, etrad, low_trad, up_trad}.
PRINT ANSWER / FORMAT "F10.6" /Title = "Traditional Large sample CI for one proportion" / CLABELS = ptrad, zstar, se_trad, etrad, low_trad, up_trad.
COMPUTE ANSWER = {pWlsn, zstar, se_wlsn, ewlsn, low_wlsn, up_wlsn}.
PRINT ANSWER / FORMAT "F10.6" /Title = "Wilson Large sample CI for one proportion" / CLABELS = pwlsn, zstar, se_wlsn, ewlsn, low_wlsn, up_wlsn.
END MATRIX.
!ENDDEFINE.
DEFINE N_L1M (e = !ENCLOSE('(',')') /c = !ENCLOSE('(',')') /s = !ENCLOSE('(',')')).
GET FILE='Q:\QUANTS\one.sav'.
COMPUTE PROB = !c.
COMPUTE E = !E.
COMPUTE PROBLG = 1 - !c.
COMPUTE PROBL = PROBLG / 2.
COMPUTE ZiL_Var = PROBIT(PROBL).
COMPUTE ZiU_Var = -1 * ZiL_Var .
COMPUTE N = (ZiL_Var**2) * (!s**2) / (!e**2).
execute.
MATRIX.
GET PROB_VAR /VARIABLES = PROB.
GET N / VARIABLES = N.
GET E / VARIABLES = E.
GET ZiL_Var /VARIABLES = ZiL_Var.
GET ZiU_Var /VARIABLES = ZiU_Var.
COMPUTE Prob = PROB_VAR(1).
COMPUTE ziL = ZiL_Var(1).
COMPUTE ziU = ZiU_Var(1).
COMPUTE ANSWER = {e, PROB, ziL, ziU, N}.
PRINT ANSWER / FORMAT "F10.5" /Title = "n_hat = estimated sample size needed to achieve an error of size e given c" / CLABELS = e, c, ziL, ziU, n_hat.
END MATRIX.
!ENDDEFINE.
*---- ... to the end of this line -------------------------------------------------------.
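*Optional check that the macros are installed. This is a sketch only: it assumes .
*the definitions above have just been run and that Q:\QUANTS\one.sav can be opened, .
*since every macro starts by reading that file. The call uses the zi_gl_zp macro .
*defined above and should print ziL and ziU of roughly -1.96 and 1.96.
zi_gl_zp p=(.95).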