




Week Six
Analysing categorical data: Chi-squared tests

This week's lecture will cover
- Analysing categorical (nominal) data
- Chi-square test of differences between proportions
- Chi-square test of independence

SPSS one-sample nonparametric tests: chi-square test of the population distribution
(1) Purpose: to infer from sample data whether the distribution of the population differs significantly from a known distribution (a goodness-of-fit test). It applies to statistical inference on categorical data.
(2) Basic hypothesis: H0: the population distribution does not differ significantly from the theoretical distribution.
(3) Basic method: from the known population proportions, calculate the expected frequency of each category in the sample, then measure the gap between the observed and expected frequencies by computing the chi-square value. A small chi-square value means the observed and expected frequencies differ little. If P is greater than α, H0 cannot be rejected and the population distribution is taken not to differ significantly from the known distribution; otherwise H0 is rejected.
(4) Basic steps:
- Menu: Analyze - Nonparametric Tests - Chi-Square
- Move the variable to be tested into the Test Variable List box
- Set the range of values to be tested (Expected Range)
  - Get from data: all cases
  - Use specified range: a user-defined range of cases
- Specify the expected frequencies (Expected Values)
  - All categories equal: every category has the same proportion
  - Values: user-defined proportions

Categorical variable
- Variables that describe categories of entities
- We deal with them all the time in statistics
- Making comparisons among variables: for example, whether consumers prefer a particular brand of a product over competing brands
- Checking whether there is a relationship between two categorical variables: for example, gender and preference for a product, i.e. whether the preference for a product is independent of gender

Chi-square test for differences between proportions
- This test involves nominal data produced by a multinomial experiment
- A multinomial experiment is a generalisation of a binomial experiment
- It tests the null hypothesis that data in the target population have a particular probability distribution

Example 1
We might test whether consumers are indifferent to which of four materials (glass, plastic, steel or aluminium) is used to make soft drink containers.
The null hypothesis is that they are indifferent (i.e. that equal numbers prefer glass, plastic, steel and aluminium).

Example 1: Data
Let $p_G$ be the probability that an individual selected at random will nominate glass as his/her preference if required to make a choice. Similarly for $p_P$ (plastic), $p_S$ (steel) and $p_A$ (aluminium).

Hypotheses
$H_0$: $p_G = p_P = p_S = p_A = 0.25$
$H_A$: at least one $p_i \neq 0.25$
The alternative is that at least one material is more preferred (or less preferred) than the others.

Example 1 cont.
Procedure: select a random sample of, say, 100 consumers and determine their preferences.
Under the null hypothesis we expect 25 consumers to nominate glass, 25 to nominate plastic, 25 to nominate steel and 25 to nominate aluminium. These are the expected frequencies, $E_i$:
$$E_i = n p_i$$
We compare the expected frequencies with the sample results, the observed frequencies $O_i$. If they are approximately the same, we would not reject the null hypothesis.
$O_i \approx E_i \Rightarrow H_0$ is probably true.

Example 1 cont., Chi square
$$\chi^2_{G-1} = \sum_{i=1}^{G} \frac{(O_i - E_i)^2}{E_i}$$
We require a test statistic to decide whether the difference is large enough to reject the null hypothesis.
We use chi square with $G - 1$ degrees of freedom, where $G$ is the number of groups.
Suppose in our example, 39 prefer glass, 16 prefer plastic, 20 prefer steel and 25 prefer aluminium. Recall that the expected frequencies were all 25.
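The expected frequencies and the statistic for this sample can be sketched in a few lines of plain Python. This is only an illustration of $E_i = n p_i$ and the chi-square sum applied to the counts quoted in Example 1; the variable names are chosen here and are not part of the lecture.

```python
# Observed preferences from the sample of n = 100 consumers (Example 1).
observed = {"glass": 39, "plastic": 16, "steel": 20, "aluminium": 25}

# Under H0 every material has the same probability, 0.25.
null_probs = {material: 0.25 for material in observed}

n = sum(observed.values())

# Expected frequencies: E_i = n * p_i
expected = {material: n * p for material, p in null_probs.items()}

# Chi-square statistic: sum over the groups of (O_i - E_i)^2 / E_i
chi_square = sum(
    (observed[m] - expected[m]) ** 2 / expected[m] for m in observed
)

# Degrees of freedom: number of groups minus one
df = len(observed) - 1

print(f"expected frequencies: {expected}")
print(f"chi-square = {chi_square:.2f} on {df} d.f.")
```

The statistic is then compared with a tabulated critical value for the chosen significance level and degrees of freedom.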
With every expected frequency equal to 25, the test statistic is
$$\chi^2_3 = \frac{(39-25)^2}{25} + \frac{(16-25)^2}{25} + \frac{(20-25)^2}{25} + \frac{(25-25)^2}{25} = 12.08$$

Obtain the critical value of chi square
- Critical $\chi^2_3 = 7.82$, the critical value at the 5% significance level with 3 d.f. (Table E4, page 742, Berenson et al. 2013)
- i.e. there is only a 5 percent chance or less that $\chi^2_3 \geq 7.82$ if $H_0$ is true.

Comparison of chi square values
$\chi^2_3 = 12.08 > 7.82 \Rightarrow$ reject $H_0$.
Conclusion: at the 5% significance level there is sufficient evidence to reject the null hypothesis. At least one of the probabilities ($p_i$) is different. The sample results indicate that the materials are not equally preferred by consumers in the target population. Thus preferences differ for at least two of the materials.

Chi square test using SPSS
Example: suppose that we want to test whether or not customers have a colour preference for packaging. Three different colours, Blue, Green and Purple, are considered. The null hypothesis is that they don't have a colour preference.
Use Analyse / Nonparametric tests / Chi-Square. The default is that the probabilities are equal.

Main display colour
          Observed N   Expected N   Residual
Blue          26          30.0        -4.0
Green         37          30.0         7.0
Purple        27          30.0        -3.0
Total         90

- Observed N: numbers of consumers actually choosing particular colours.
- Expected N: numbers of consumers expected to choose particular colours if the null is true.
- Residual: the observed and expected counts are different, but different enough to reject the null?

Test Statistics
                 Main Display Colour
Chi-Square(a)         2.467
df                    2
Asymp. Sig.           .291
a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 30.0.

- Chi-Square: the chi-square statistic.
- df: degrees of freedom, groups - 1.
- Asymp. Sig.: check this to test the null.

Check the Sig. value to test $H_0$.
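As a cross-check on the SPSS output above, the same three numbers (chi-square, df and Asymp. Sig.) can be recomputed by hand. The sketch below uses only the Python standard library. It relies on the fact that, for exactly 2 degrees of freedom, the upper-tail probability of the chi-square distribution is exp(-x/2); that shortcut is an assumption of this sketch and does not carry over to other degrees of freedom. It is not how SPSS itself computes the value.

```python
import math

# Observed counts from the SPSS output for main display colour.
observed = {"Blue": 26, "Green": 37, "Purple": 27}

n = sum(observed.values())        # 90 consumers in total
expected = n / len(observed)      # equal probabilities under H0: 30.0 each

# Residuals as SPSS reports them: observed minus expected
residuals = {colour: count - expected for colour, count in observed.items()}

# Chi-square statistic and degrees of freedom (groups - 1)
chi_square = sum(r ** 2 / expected for r in residuals.values())
df = len(observed) - 1

# Assumption: with exactly 2 d.f. the upper-tail probability has the
# closed form exp(-x / 2); other d.f. need a different formula.
p_value = math.exp(-chi_square / 2)

print(f"chi-square = {chi_square:.3f}, df = {df}, Asymp. Sig. = {p_value:.3f}")
```

The rounded values agree with the Test Statistics table (2.467, 2 and .291).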
We cannot reject the null ($H_0$) that all three colours are equally preferred, because Sig. = 0.291 > 0.05.
Conclusion: at the 5% significance level there is insufficient evidence to conclude that consumers in the target population have a preference for at least one of the three colours of packaging.

Tests of independence: Chi-squared test of a contingency table
This test satisfies two different problem objectives:
- Are two nominal variables related?
- Are there differences among two or more populations of nominal variables?

Consider the following 3 features: height in centimetres, weight in kilograms and colour of eyes.
- Whilst some people are tall and thin, on average taller people weigh more than shorter people. Weight and height are not independent.
- It seems unlikely that people with blue eyes weigh more, on average, than people with brown eyes. Weight and eye colour are almost certainly independent.

Frequency analysis under cross-classification
Purpose: to understand how the data are distributed across the levels of different variables.
- Example: is academic performance related to gender? (two variables)
- Example: are occupation, gender and fondness for shopping related? (three variables)
Main steps of the analysis:
- Produce the cross-tabulation (contingency table)
- Analyse the relationship between the variables in the contingency table

Producing the cross-tabulation
What is a contingency table? Income is the column variable, job title is the row variable, region is the control variable, and the cells hold frequencies.

Job title            Income: High (persons)   Medium (persons)   Low (persons)
Senior engineer
Engineer
Assistant engineer
Technician
Total

Producing the cross-tabulation: basic steps
(1) Menu: Analyze - Descriptive Statistics - Crosstabs
(2) Select one variable as the row variable and move it to the Row box.
(3) Select one variable as the column variable and move it to the Column box.
(4) Optionally move one or more control variables to the Layer box. Layer settings for control variables: within the same layer the numbers of levels add; across different layers the numbers of levels multiply.
(5) Choose whether to display a bar chart for each group (Display clustered bar charts).

Producing the cross-tabulation: further calculations
Cells option: choose which percentages to output in the frequency table.
- Row: row percentages (Row pct)
- Column: column percentages (Col pct)
- Total: total percentages (Tot pct)

Analysing the relationship between the variables in a contingency table
Purpose: to test, through contingency table analysis, whether the row and column variables are independent.
Method: the chi-square test, which measures association in qualitative (categorical) data.

Analysing the relationship between the variables in a contingency table: chi-square test
Cross-tabulation of age and wage income

            Low    Medium    High
Young       400       0        0
Middle        0     500        0
Old           0       0      600

            Low    Medium    High
Young         0       0      500
Middle        0     600        0
Old         400       0        0

Analysing the relationship between the variables in a contingency table: basic steps of the chi-square test
(1) H0: the row and column variables are unrelated, i.e. independent of each other.
(2) Construct the chi-square statistic. The statistic follows a chi-square distribution with (r-1)*(c-1) degrees of freedom.
- Count: the observed (actual) frequency
- Expected count: the expected frequency (it reflects how the data would be distributed if H0 held)
- Residual: observed frequency minus expected frequency

            Excellent   Good   Fair   Pass   Total
Male            10        5      5      3      23
Female           8       12      4      1      25
Total           18       17      9      4      48
% of total    37.5     35.4   18.8    8.3     100

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

Judging visually from charts whether two categorical variables are related
Three ways to present the data: 1. contingency table; 2. 3-D column chart; 3. 2-D bar chart.

              No lung cancer   Lung cancer   Total
Non-smoker         7775             42        7817
Smoker             2099             49        2148
Total              9874             91        9965

[Figures: 3-D column chart and 2-D bar chart of the counts above, by smoking status (smoker, non-smoker) and lung cancer status (lung cancer, no lung cancer); frequency axis from 0 to 8000]
- The 3-D column chart shows clearly the relative size of each frequency.
- The 2-D bar chart shows that the proportion of people with lung cancer is higher among smokers than among non-smokers.

Tests of independence cont.
Example 2
Suppose we interviewed 400 people and asked them which of three age groups they are in (under 25, 25 to 60, and over 60).
We also asked for their response to the statement that "All imports of automobiles should be banned in order to protect the local industry" (agree, no view either way, disagree).

Attitudes towards banning imports
Age group    agree   no view   disagree   Total
under 25       19       53        25        97
25 - 60        46       94        47       187
over 60        30       56        30       116
Total          95      203       102       400

Tests of independence cont.
Example 2 cont.
Null hypothesis: the answers to the two questions are independent.
Under the null:
Prob(over 60 and agree) = Prob(over 60) × Prob(agree)   (multiplication rule for independent events)
Expected frequency = Prob(over 60) × Prob(agree) × sample size
Short-cut for expected frequencies, where $R_i$ is the total of row $i$, $C_j$ is the total of column $j$ and $n$ is the sample size:
$$E_{ij} = \frac{R_i}{n} \cdot \frac{C_j}{n} \cdot n = \frac{R_i C_j}{n}$$
Procedure
- We set up a cross-tabulation showing the observed frequencies of answers to the two questions.
- We calculate the expected frequencies.
Test
- Our test is based on a comparison of the observed and expected frequencies.

Age * attitude to banning imports Cross tabulation
                                Attitude to ban imports
Age group                    Agree   No view   Disagree   Total
Under 25   Count              19.0     53.0      25.0      97.0
           Expected Count     23.0     49.2      24.7      96.9
25-60      Count              46.0     94.0      47.0     187.0
           Expected Count     44.4     94.9      47.7     187.0
Over 60    Count              30.0     56.0      30.0     116.0
           Expected Count     27.6     58.9      29.6     116.1
Total      Count              95.0    203.0     102.0     400.0
           Expected Count     95.0    203.0     102.0     400.0

Calculation for the expected frequency of agree and over 60: 95 × 116 / 400 = 27.55, shown as 27.6.
The count (observed) and the expected count are different, but different enough to reject the null?

Chi-squared test for independence
$$\chi^2_{(r-1)(c-1)} = \sum_{i} \sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
Rationale: $O_{ij} \approx E_{ij} \Rightarrow H_0$ is probably true.
Test statistic
We requi
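
The Example 2 calculation can be sketched end to end in plain Python: row and column totals, expected frequencies from the short-cut $E_{ij} = R_i C_j / n$, the chi-square statistic and its degrees of freedom. The helper `chi2_upper_tail_even_df` is a hypothetical name introduced here; it uses a closed form that is valid only for even degrees of freedom, which happens to cover the 4 d.f. of a 3 × 3 table. Treat it as an illustration, not as the SPSS procedure.

```python
import math

# Observed counts: rows are age groups, columns are attitudes
# (agree, no view, disagree), taken from the Example 2 table.
observed = [
    [19, 53, 25],   # under 25
    [46, 94, 47],   # 25 - 60
    [30, 56, 30],   # over 60
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Short-cut formula for expected frequencies: E_ij = R_i * C_j / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over all cells of (O_ij - E_ij)^2 / E_ij
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (r - 1) * (c - 1)
df = (len(row_totals) - 1) * (len(col_totals) - 1)


def chi2_upper_tail_even_df(x, df):
    """P(chi-square >= x) for even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!"""
    if df % 2 != 0:
        raise ValueError("this closed form only covers even degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(df // 2)
    )


p_value = chi2_upper_tail_even_df(chi_square, df)

print(f"expected (over 60, agree) = {expected[2][0]:.2f}")
print(f"chi-square = {chi_square:.2f}, df = {df}, p-value = {p_value:.2f}")
```

For these counts the statistic comes out at roughly 1.44 on 4 degrees of freedom, with a p-value well above 0.05, so this sample would not lead to rejecting independence of age group and attitude.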